ABSTRACT

This  final  report  analyzes  results  of  a 
comprehensive,  integrative  review  of  273  primary  research  studies  on 
modifying  attitudes  toward  disabled  persons.  Attitude  modification 
approaches  identified  included  providing  information,  direct  contact 
with  disabled  persons,  vicarious  experiences  related  to  having  a 
disability,  systematic  desensitization,  positive  reinforcement,  and 
combinations  of  these  approaches.  The  primary  analytic  focus  was  on 
mean  effect  sizes:  644  effect  sizes,  obtained  from  200  reports, 
constituted  the  main  data  set.  Outcomes  were  not  related  to  three 
basic  indicators  of  study  quality  (general  treatment  validity, 
general  internal  validity,  and  adequacy  of  test  validity). 
Methodological  quality  was  not  high,  treatment  effects  were  moderate 
and  heterogeneous,  and  treatment  techniques  were  heterogeneous  in 
characteristics  and  outcomes.  The  studies  differed  on  such 
characteristics  as  type  of  comparison,  time  of  posttest,  and  type  of 
dependent  measure.  Alternative  and  supplementary  data  explorations 
did not produce results markedly different from the analyses of
individual effect sizes. Among the areas of needed improvement in
research methodology are definition of the construct of "attitude
toward persons with disabilities," selection of assessment techniques
that  yield  reliable  and  valid  scores,  attention  to  treatment  validity 
(including  more  careful  reporting  of  procedures  and  design  elements), 
and  the  development  of  programmatic,  replication-oriented  research. 
(Author/JW) 


EXECUTIVE SUMMARY

The  Modification  of  Attitudes  Toward  Persons 
With  Handicaps:    A  Comprehensive  Integrative  Review  of  Research 

James P. Shaver, Charles K. Curtis,
Joseph Jesunathadas, and Carol J. Strong

Changing Negative Attitudes
Has Been a Research Interest

Legislation and judicial decisions are bringing handicapped persons into
the mainstream of educational, social, and economic life in this society.
Nevertheless, negative attitudes toward persons with disabilities continue to
be detrimental to their potential to live dignified, productive lives and to
contribute to society. A major research interest has been how to modify the
negative attitudes and thereby mitigate the effects on persons with
disabilities.

Prior  Reviews  Have  Not  Been 
Comprehensive  or  Quantitative 

Despite the availability of much larger numbers of reports of research
on modifying attitudes toward disabled persons, prior reviews of that
research have typically cited only 32 to 36 studies. Moreover, an analysis
of seven full reviews and eight brief reviews indicated that they suffered
from a number of shortcomings. For example, methods of locating reports and
criteria for including research reports in the review were not reported;
there was a lack of systematic data collection and analysis as a basis for
conclusions about the effectiveness of methods of attitude change; and, in
almost all of the reviews, the moderating effects of other variables
(variations in treatments, other study characteristics, and sample
characteristics) were not addressed.

As a result of the lack of comprehensiveness and the narrative approach
of the prior reviews, it could not be discerned whether the conclusions
generally drawn (i.e., that the effects of attitude modification techniques
were negligible and that both negative and positive results were often
produced) reflected inadequacies in the review procedures or were accurate
depictions of the research literature.

This Review Was Comprehensive
and  Quantitative 

This integrative review was based on all reports of relevant research
that could be located through computer and hand searches of commonly used
indexes, and in the reference lists for prior reviews and primary research
reports. A coding instrument was used to collect data on sample
characteristics, the attitude modification techniques investigated and their
characteristics, design and instrumentation, including indicators of
methodological quality, and effect sizes. The basic effect size was Delta
(symbolized as D): the mean of the experimental group minus the mean of the
control group, divided by the control group standard deviation
($D = (\bar{X}_E - \bar{X}_C) / SD_C$). Correlation coefficients were also
used, and variance ratios (the posttest variance of the experimental group
divided by the posttest variance of the comparison group) were computed for
exploratory analyses.
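
The two statistics just defined can be illustrated with a minimal Python
sketch. It is not taken from the report: the function names and the scores
are hypothetical, and the use of the sample (n - 1) standard deviation is an
assumption, since the summary does not say which form was used.

```python
from statistics import mean, stdev, variance


def glass_delta(experimental, control):
    """Effect size D: (experimental mean - control mean) / control group SD."""
    # Assumption: sample (n - 1) standard deviation of the control group.
    return (mean(experimental) - mean(control)) / stdev(control)


def variance_ratio(experimental, control):
    """Posttest variance of the experimental group over that of the comparison group."""
    return variance(experimental) / variance(control)


# Hypothetical posttest attitude scores, for illustration only.
treatment_scores = [14, 16, 18, 20, 22]
control_scores = [12, 14, 16, 18, 20]

d = glass_delta(treatment_scores, control_scores)
ratio = variance_ratio(treatment_scores, control_scores)
print(f"D = {d:.2f}, variance ratio = {ratio:.2f}")
```

A positive D means the treatment group scored higher than the control group
in control-group standard deviation units; a variance ratio above 1 means the
treatment group was more variable at posttest.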

644  Effect  Sizes  from  200  Studies 
Were  the  Main  Data  Set 

The number of primary research studies coded for the integrative review
was 273 (based on 303 reports, some of which were about the same research).
Two hundred studies involving treatment versus control, treatment versus
placebo, or single-group, pre-posttest comparisons for which data were
available to compute effect sizes made up the population for the main
analyses of data for the review. Supplementary analyses were conducted on
effect sizes from 15 studies in which alternative treatments, rather than
treatment versus the absence of treatment, were compared, and from nine
reports of the effects of mainstreaming students with disabilities in regular
classrooms. In addition, the results from 55 studies for which statistical
significance, but not the information to compute effect sizes, was reported,
were also coded for a supplementary analysis to determine whether that
population of studies differed from our main population of 200 studies.

All told, 644 effect sizes, obtained from 200 reports, constituted our
main data set. In addition, 51 effect sizes from Treatment A vs. Treatment B
comparisons, 18 effect sizes from mainstreaming studies, and 182 results for
which effect size information was missing were obtained for analysis.

Inferential Statistics Were Not Used

Because the search for reports was comprehensive, including efforts to
obtain information for computing effect sizes when it was not available in
the report, the studies upon which data were collected were considered a
population rather than a sample. Consequently, inferential statistics were
not used in analyses.

The primary analytic focus was on mean effect sizes. Standard
deviations were also reported, along with Eta²s as appropriate to indicate
the proportion of the variance in Ds associated with the particular
categories included in the analyses. Data were organized in one-way and
two-way tables, and differences were scrutinized using as the criterion for
triviality any difference between mean Ds that was less than .12. Pearson
product-moment correlations were also run on some of the data.
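
A minimal Python sketch of these two descriptive tools follows. It is
illustrative only: the function names and the effect sizes are hypothetical,
and Eta² is computed by its standard definition (between-category sum of
squares over total sum of squares), which is assumed to match the report's
usage. Only the .12 criterion comes from the text.

```python
from statistics import mean

TRIVIALITY_CRITERION = 0.12  # difference between mean Ds treated as trivial below this


def eta_squared(groups):
    """Proportion of variance in Ds associated with category membership.

    `groups` maps a category label to the list of Ds coded in that category.
    """
    all_ds = [d for ds in groups.values() for d in ds]
    grand_mean = mean(all_ds)
    ss_total = sum((d - grand_mean) ** 2 for d in all_ds)
    ss_between = sum(
        len(ds) * (mean(ds) - grand_mean) ** 2 for ds in groups.values()
    )
    return ss_between / ss_total


def is_trivial(mean_d_a, mean_d_b):
    """True when two mean Ds differ by less than the triviality criterion."""
    return abs(mean_d_a - mean_d_b) < TRIVIALITY_CRITERION


# Hypothetical effect sizes in two categories, for illustration only.
hypothetical_groups = {"A": [0.2, 0.4, 0.6], "B": [0.5, 0.7, 0.9]}
eta2 = eta_squared(hypothetical_groups)
print(f"Eta squared = {eta2:.3f}")
```

Because the studies were treated as a population, these quantities are used
descriptively; no significance test is attached to them.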

Outcomes Were Not Related to Study Quality

Initial analyses indicated that outcomes were not related to three basic
indicators of study quality: general treatment validity, general internal
validity, and adequacy of test validity. That lack of relationship was at
least in part because few studies were rated high (on a three-point scale) on
any of the indicators of quality.

Analyses also indicated that, as has been the case in other quantitative
integrative reviews, the mean D for journal articles was higher than that for
other types of research reports. By the same token, the mean D for
single-group, pre-posttest comparisons was higher than that for treatment
versus control and treatment versus placebo comparisons. However, there was no
indication that either type of report or type of comparison was unevenly
distributed among the various attitude modification techniques which had been
investigated.

Methodological Quality Was Not High

The predominantly low to moderate ratings of quality reflected a variety
of methodological weaknesses in the studies, including failures to verify the
implementation of the independent variable, low ratings on actual treatment
implementation, low ratings on various indicators of treatment and internal
validity, and a lack of information in many cases to determine whether or not
there were threats to treatment or internal validity. There also was a lack
of replications. Perhaps the most important conclusion drawn from the
integrative review was that the methodological quality of research in this
field is not high, and future efforts must address both the more adequate
design of individual studies and the need for programs of research which
include the replication of studies to determine the reliability and
generalizability of results.


Despite the inability to determine whether results would have been
different for high quality studies, the integrative review was carried out,
based on the recognition that moderate and low quality studies constituted
the best evidence available, but that interpretations must be made
cautiously.

Treatment  Effects  Were  Moderate  and 
Heterogeneous 

Ten attitude modification approaches were identified: providing
information about disabilities and persons with disabilities; direct contact
with persons with disabilities; providing situations for vicarious
experiences related to having a disability; the use of systematic
desensitization to extinguish negative attitudes; the use of positive
reinforcement to modify attitudes; information plus direct contact;
information plus vicarious experiences; and a category of "other" which
encompassed other combinations of the first eight techniques. The tenth
category was for studies that contrasted different types of persuasive
messages. For two of the ten techniques (positive reinforcement and
persuasive message contrast) so few effect sizes were available that they
were largely excluded from consideration.

The mean D for the 644 comparisons in our main data set was .37, a
moderate level of effect. The treatment techniques were ranked by size of
mean D in the following order: Persuasive Messages, mean D = .67 (N = 23);
Information Plus Contact, mean D = .51 (N = 100); Direct Contact, mean D
= .43 (N = 93); Vicarious Experiences, mean D = .40 (N = 58); Other, mean D
= .39 (N = 71); Systematic Desensitization, mean D = .32 (N = 21);
Information, mean D = .29 (N = 203); and Information Plus Vicarious
Experiences, mean D = .20 (N = 62). Considerable variability in outcomes
was indicated by the overall standard deviation for Ds of .61, along with
standard deviations for the eight attitude modification techniques which
ranged from .36 to .76, with six of them .50 or above.
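
As a rough consistency check on these figures, the N-weighted average of the
eight technique means can be compared with the overall mean D of .37. The
Python sketch below is illustrative and not part of the original analysis.
The eight Ns sum to 631, so the remaining 13 of the 644 effect sizes
(presumably the two largely excluded categories, whose means are not given
here) are not covered by the check.

```python
# (mean D, N) for the eight ranked techniques, as reported in the summary.
techniques = {
    "Persuasive Messages": (0.67, 23),
    "Information Plus Contact": (0.51, 100),
    "Direct Contact": (0.43, 93),
    "Vicarious Experiences": (0.40, 58),
    "Other": (0.39, 71),
    "Systematic Desensitization": (0.32, 21),
    "Information": (0.29, 203),
    "Information Plus Vicarious Experiences": (0.20, 62),
}

# Total number of effect sizes across the eight techniques.
total_n = sum(n for _, n in techniques.values())

# Each technique mean weighted by the number of effect sizes behind it.
weighted_mean = sum(m * n for m, n in techniques.values()) / total_n

print(f"N = {total_n}, weighted mean D = {weighted_mean:.2f}")
```

The weighted mean agrees with the reported overall mean to two decimal
places, which suggests the technique-level figures are internally consistent.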

Clearly, although it was possible to rank order the techniques by their
mean Ds, there was much heterogeneity, with the distribution of Ds for the
various techniques overlapping one another. Moreover, for each technique,
there were negative Ds (the treatment group gain was less than that for the
comparison group). And, there were instances in which the change for the
treatment group was negative (even though the D may have been positive
because the comparison group had a more negative change).

There  Were  Also  Variations  in 
Treatment  Features 

Variations occurred in such treatment features as the disabilities
toward which attitude modification efforts were directed; the types of
information and information delivery modes used, whether information gain was
assessed and, if so, the degree of gain, in investigations of information as
an attitude modification technique; the types of experiences provided in
vicarious experience studies; the types of persuasive messages and how they
were presented; and the types of contact and the situations in which it
occurred, the characteristics and relative status of the disabled persons,
the presence of institutional support, and the competencies of the disabled
persons, in contact studies. However, coding to determine the extent to
which contact studies reflected factors considered theoretically important
for the modification of attitudes was not fruitful because many reports
contained inadequate information for scoring and, when information was
available, effect sizes were often clustered in only one or two categories.
In short, the various treatment techniques were heterogeneous in terms of
features, as well as outcomes.

Studies Also Differed in Other Ways

Other characteristics on which the studies differed included type of
comparison (i.e., whether treatment versus control, treatment versus placebo,
or single-group, pre-posttest), time of posttest, type of dependent measure,
length of treatment, the context for the treatment (e.g., whether an
elementary-secondary school or college-university environment), setting (the
specific place where the research was conducted, e.g., classroom or
laboratory), and sample size. Occasionally, there were different
relationships between Ds and concomitant variables across treatments; but
treatments were sometimes nested within study characteristics and Ds were
lacking for some treatment-characteristic combinations, so drawing
conclusions about relative effects was difficult.

There were also differences in samples, such as the methods of
selection and the grade-age levels and gender of the Ss. Not only did the
studies vary on these sample characteristics, but there were some
differential effects across treatments. However, with the nesting of
treatments and small Ns or empty cells, relations to treatment outcomes were
not clear. Very few reports contained any information on interactions
between Ss' prior contact with persons with disabilities or Ss' personality
traits and treatment outcomes.

Alternatives  to  the  Analysis  of  Individual  Ds  Were 
Explored and Supplementary Analyses Carried Out

Alternative and supplementary data explorations included analyses based
on a median effect size for each study rather than individual effect sizes,
and analyses with outliers (Ds greater than two standard deviations from the
mean) excluded. Neither of these analyses produced results markedly
different from the analyses of individual effect sizes.
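
Both explorations are simple to express in code. The following Python sketch
is illustrative only: the function names and data are hypothetical, and the
use of the population standard deviation is an assumption (chosen because
the studies were treated as a population), not something the summary states.

```python
from statistics import mean, median, pstdev


def study_medians(effects_by_study):
    """Reduce each study to one value: the median of its effect sizes."""
    return {study: median(ds) for study, ds in effects_by_study.items()}


def exclude_outliers(ds, k=2.0):
    """Drop Ds lying more than k standard deviations from the mean D."""
    center = mean(ds)
    spread = pstdev(ds)  # assumption: population SD
    return [d for d in ds if abs(d - center) <= k * spread]


# Hypothetical data, for illustration only.
effects_by_study = {"study_1": [0.2, 0.4, 0.9], "study_2": [0.1, 0.3]}
medians = study_medians(effects_by_study)

all_ds = [0.1, 0.2, 0.3, 0.4, 0.5, 3.0]
trimmed = exclude_outliers(all_ds)
print(medians, trimmed)
```

Using one median per study keeps studies that report many effect sizes from
dominating the results, and trimming shows whether a few extreme Ds drive
the means.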

In addition, Ds from Treatment A vs. Treatment B comparisons (i.e.,
direct comparisons of different treatment techniques) and Mainstreaming
studies were analyzed separately. However, there were so few effect sizes in
either category that little information was gained.

Also, studies for which the statistical significance of results was
reported, but information to compute effect sizes was not available, were
coded and frequency data analyzed. The Missing Information and effect size
studies constituted different populations. For example, the Missing
Information studies were likely to be of somewhat lower quality and to have
been conducted in different contexts (e.g., a higher percentage of
inservice-education and work contexts) than were those for which Ds could be
computed. Different attitude modification techniques also were represented in
the Missing Information studies.

Differences in the posttest variances of treatment and comparison groups
were also explored. Overall, treatment groups were slightly more variable at
posttest than were the control groups (mean variance ratio = 1.13). It was
postulated that it would be desirable for treatment groups to have higher
posttest means (more positive attitudes) and lower posttest variabilities
than their comparison groups. However, the relationships between posttest Ds
and variance ratios for different treatments were not consistent. Treatment
effects on variability appear to be a candidate for further attention in
primary research studies and in integrative reviews.

Conclusions Were Drawn About Outcomes,
Research Quality, and Attitude Change Research

This comprehensive integrative review confirmed the conclusions in prior
reviews that the effects of attitude modification techniques are often not
large and frequently are contradictory.

It was also concluded that the general quality of research in the
modification of attitudes toward persons with disabilities has not been high.
Among the areas of needed improvement in methodology are definition of the
construct of "attitude toward persons with disabilities," the selection of
assessment techniques that yield reliable and valid scores, attention to
treatment validity, including verification of the independent variable, the
reduction of threats to internal validity, more careful reporting of
procedures and design elements, and, most important of all, the development
of programmatic, replication-oriented research.

It was noted that designing and conducting studies with perfect internal
validity is extremely difficult in an applied field such as attitude
modification, and one major flaw will invalidate an otherwise valid study.
Moreover, complex interactions between variables may account for the moderate
and contradictory results from primary research studies. How to modify
attitudes toward persons with disabilities should be a continuing item on the
research agenda. However, how to channel the behavior of individuals who have
negative attitudes so as to avoid the restrictive and dehumanizing effects on
persons with disabilities should continue to be both a policy and research
focus. We ought not simply assume that research will someday tell us how to
obliterate negative attitudes.

CHAPTER 1

WHY  ANOTHER  REVIEW  OF  THE  LITERATURE 

In the last two decades, much progress has been made toward assuring
equity in education, housing, and employment for handicapped persons. The
Architectural Barriers Act of 1968, Section 504 of the Rehabilitation Act of
1973, Public Law 94-142, and recent court decisions supporting the rights of
disabled persons have signaled a marked shift in public policy in regard to
handicapped populations. Among other changes that have resulted, increased
mainstreaming in public schools and reductions of physical barriers have been
major steps toward bringing handicapped persons into full participation in
the educational, social, and economic life of the society. Yet, despite
these auspicious beginnings, a major barrier to equal opportunities for
handicapped persons remains: the attitudes of nondisabled persons toward
them.

Bogdan  and  Biklen  (1977)  used  the  term  "handicapism"  to  emphasize  the 
similarity  in  the  situations  of  handicapped  persons,  ethnic  minorities,  and 
women.  All  may  suffer  from  stereotyping  and  prejudice.  As  with  racism  and 
sexism,  handicapism  refers  to  sets  of  assumptions  and  practices  which  reflect 
attitudes  and  lead  to  "differential  and  unequal  treatment",  based  on 
"apparent  or  assumed  physical,  mental,  or  behavioral  differences"  (p.  14). 
Writings by Bowe (1978, 1980), Cohen (1977), and Kleinfield (1979) have
illustrated  well  how  unfounded,  negative  attitudes  of  other  persons  often 
limit  the  opportunities  of  disabled  individuals,  thus  handicapping  them. 


Examples include educators who underestimate the potential of disabled*
students, employers who lack confidence in the abilities and motivation of
disabled potential employees, and nondisabled persons who hesitate to
socialize with those who are disabled. It is of equal, if not greater,
consequence that the result can be loss of self-esteem and the self-limiting
of options by disabled persons, as they adopt the attitudes of others toward
themselves.

Changing attitudes toward disabled persons to eradicate the effects of
handicapism is not simply a matter for direct legislative or judicial action.
As Itzhak Perlman, the world renowned violinist whose access to some concert
halls is impeded because of his physical disability, expressed it:

There  is  a  real  problem  in  society.  Certainly  I  would  like  to  see 
public  transportation  and  architecture  that  everyone  can  use  without 
difficulty.  But  what  I  most  want  funds  can't  buy,  and  that's  a  change 
in  attitudes  toward  the  disabled  so  that  laws  don't  have  to  be  enforced. 
(Salt  Lake  Tribune,  10/3/84,  p.  2A). 

Laws that bring handicapped persons into the mainstream of life, as PL 94-142
does in schools, may in the long run have favorable effects on attitudes.
But how to modify stereotypic attitudes toward those who are disabled is a
significant educational question (Shaver & Curtis, 1981a, b) for several
reasons. First, a large number of people are affected directly: some 35 to
50 million persons with potentially handicapping mental and physical
disabilities (Bowe, 1978, p. 17; Kleinfield, 1979, p. 32). Moreover, the


*The terms "disabled" and "handicapped" are not used consistently in society.
Sometimes they are used interchangeably, sometimes to denote different
concepts. The authors prefer to use "disabled" to refer to persons who have
physical or mental impairments, with "handicapped" used to refer to those
persons  for  whom  disabilities  limit  their  ability  to  function  (Shaver  & 
Curtis,  1981a,  pp.  1-2).  The  point  relevant  to  this  report  is  that  the 
attitudes  of  other  persons  towards  one's  disability  may  make  it  a  handicap, 
as  well  as  increase  the  extent  to  which  it  is  a  handicap.  This  distinction 
in  usage  is  hard  to  maintain,  and  we  have  not  been  able  to  do  so  throughout 
the  report.  In  keeping  with  general  practice,  we  have  used  "handicapped"  as 
the  more  "generic"  term. 

consequences of handicapism are of grave concern in a society, such as ours,
in which human worth and dignity and the development of individual potential
are basic values. In addition, although the deleterious effects of
handicapism on the handicapped are well known, we often overlook the
self-degradation of those who act on prejudice to deny others their rights
(Robert Coles, Introduction to Kleinfield, 1979). Finally, the costs to
society are high, in terms of both productivity and public expenditures for
social services, when handicapped persons are undereducated, underemployed,
and underpaid.

The  Problem 

In light of the significant potential effects of attitudes toward
persons with disabilities, it is not surprising that a considerable amount of
research has been done to investigate ways to modify those attitudes. Yet,
as Jones (1984, p. vii) noted recently in an introduction to a review volume,
"this varied and rich [research] literature . . . has not yet been
synthesized for special education consumers and researchers," and
"state-of-the-art overviews" and "critical reviews" of the research on
attitudes toward handicapped persons are needed.

It is not that there have been no reviews of the research on modifying
attitudes toward persons with disabilities. Prior to beginning the study
reported in following chapters, we identified three articles (Anthony, 1972;
Donaldson, 1980; Sandler & Robinson, 1981) and a chapter (Towner, 1984) that
were focused on that literature. In addition, ten review articles* were


*Alexander and Strain (1978), Chubon (1982), English (1971), Frith and
Mitchell (1981), Harth (1973), Horne (1979), Levitt and Cohen (1976),
Mitchell (1976), Rabkin (1972), Segal (1978).

located that have brief sections on the topic. However, all of these reviews
were  found  to  suffer  from  the  common  weaknesses  in  integrative  reviews  which 
Jackson  (1978,  1980)  cited  following  his  analysis  of  a  random  sample  of 
review  articles. 

Although reviewing and synthesizing research findings has been a common
activity in social science research, Jackson (1978, 1980) found an absence of
well-defined procedures for conducting integrative reviews. Included in his
list of "important weaknesses in the currently prevailing methods of
integrative reviews" (1978, p. 37) were: (1) the lack of thorough,
systematic searches of the literature, (2) a tendency not to represent study
findings as powerfully as possible, (3) the failure to consider
systematically the possible relationships between study characteristics and
outcomes, and (4) the lack of adequate reporting of review methods. Based on
his analysis, Jackson (1978, Ch. 6) proposed seven elements to be considered
in judging the quality of reviews:

1. Topic Selection: Was the topic clearly defined and delimited?

2. Review of Previous Work: Were previous efforts to review similar
bodies of literature cited and critiqued so that (a) it is clear
how the present work will differ from or extend previous work, (b)
an appropriate point of departure for the present work can be
determined, and (c) the present work will avoid the mistakes of past
reviews?

3. Selection of Studies to be Reviewed: Were the criteria for selecting
studies to be reviewed clearly explicated? Was a representative or
comprehensive sample of previous research on the topic reviewed, so
that results of the review are generalizable to the "population" of
research studies?

4. Data Collection: Were data collection procedures specifically
described and defended on rational and empirical grounds? Were data
collected for each study on the magnitude and direction of outcomes,
the characteristics of dependent variables (study outcomes), the
intervention, and other study or subject characteristics, such as
age of students, implementation of intervention, methodological
quality?

5. Data Analysis: Were the relationships between dependent and
independent variables, including treatment and concomitant
variables, examined? Were appropriate analysis techniques utilized?

6. Interpretation: Were conclusions carefully based on the data,
including evidence of confounding variables? Were implications of
interest to relevant audiences (e.g., practitioners, researchers,
policy makers) drawn?

7. Reporting: Were results reported in such a way that the reader can
tell exactly what procedures and operational definitions were used?
Could the investigation be replicated based on the information
reported? Were conclusions and recommendations clearly stated?

None of the four full reviews on modifying attitudes toward the disabled
that we identified stacked up well against Jackson's criteria, nor did the
brief sections in broader reviews. These reviews, and others located after
the study began, will be discussed in considerable detail in Chapter 2. The
purpose here is to set the context for the decision to conduct another
review. For example, in none of the four reviews were previous reviews
critiqued and a statement made as to how the review being reported differed
from, extended, or benefited from prior reviews. In fact, in only one
(Towner, 1984) was a prior review acknowledged. Moreover, none of the four
reviews specified how its sample of primary research reports was selected for
review. That some selection was likely, especially for the reviews reported
since 1972, seemed obvious. The slight increases and one decrease in numbers
of reports cited (Anthony, 1972, N=31; Donaldson, 1980, N=22; Sandler &
Robinson, 1981, N=40; and Towner, 1984, N=47) did not reflect the expected
year-by-year accretion in publications. Moreover, considerable selection was
also suggested by a comparison of those numbers against the nearly 200
research reports which we had already identified as potentially relevant for
review.


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Data  collection/  analysis/  and  reporting  procedures  were  also  identified 
as  major  shortcomings  in  the  four  reviews.  In  fact/  the  type  of  data 
collection  was  not  evident  in  any  of  the  reviews;  the  results  of  primary 
research  projects  were  described  verbally  in  terms  of  differences  (e.g./ 
"Soloway  found  that  more  favorable  attitudes  toward  disabled  children  were 
demonstrated  by  teachers  without  integration  experience  ....  Negative 
changes  in  attitude  toward  and  optimism  concerning  the  integration  of 
exceptional  learners  were  found  by  Fen ton  (1975)  and  Shotel  et  al.  (1974)" 
(Towner/  1984/  p.  236).  As  the  example  implies/  the  magnitude  of 
differences  was  not  reported/  nor  were  summary  statistics  (e.g.^  means  and 
standard  deviations)/  statistical  analyses/  or  the  probabilities  of  results 
presented.  The  basic  reporting  style  was  narrative/  with  little  effort  to 
summarize  findings  systematical''y.  The  exception  was  Towner  (1984)/  who 
pcovided  summary  tables/  with  studies  identified  as  "successful"  if 
statistically  significant  results  were  obtained.  With  the  exception  of 
Towner  (1984)/  the  methodologies  of  primary  studies  were  not  critiqued. 
However/  Towner  discussed  the  covariation  of  outcomes  with  methodological 
soundness  in  only  general  Lerms;  the  research  outcomes  were  not  expressed  in 
quantitative  terms  that  would  allow  a  systematic  analysis  and  description  of 
met-\odology~result  relationships.  The  statistical  significance  of  results 
was  referred  to  in  Towner's  review/  but  without  recognition  that  statistical 
significance  is  relative  to  sample  size  and  so  does  not  provide  a  measure  of 
research  results  that  is  comparable  across  studies  (Shaver/  1980). 
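
The dependence of statistical significance on sample size can be
illustrated with a short computation. The following is a minimal sketch and
is not drawn from any of the reviews discussed: the means, standard
deviation, and group sizes are hypothetical, and the function names are
illustrative. It assumes two independent groups of equal size, for which the
t statistic equals the standardized mean difference multiplied by the square
root of n/2.

```python
import math


def standardized_mean_difference(mean_treated, mean_control, sd_pooled):
    """Effect size: the difference between group means in pooled SD units."""
    return (mean_treated - mean_control) / sd_pooled


def t_statistic(effect_size, n_per_group):
    """t for two independent groups of equal size n (assumed design)."""
    return effect_size * math.sqrt(n_per_group / 2)


# Hypothetical attitude-scale results: the same 2-point difference and the
# same pooled SD of 10 in every study; only the sample size changes.
d = standardized_mean_difference(52.0, 50.0, 10.0)

for n in (20, 200, 2000):
    print(f"n per group = {n:4d}  effect size = {d:.2f}  t = {t_statistic(d, n):.2f}")
```

With the standardized mean difference held at 0.2, the t statistic rises
from about 0.63 with 20 persons per group to 2.0 with 200 and to about 6.32
with 2,000. A tally of "statistically significant" studies therefore
confounds the size of an effect with the size of the sample, whereas the
effect size is the same in all three cases.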

Given the shortcomings of these reviews of research on modifying
attitudes toward the handicapped, it is not surprising that even the author
of the most recent, thorough, and systematic review (Towner, 1984) concluded:

The applications [of similar techniques] yielded discouraging and
contradictory findings. Both positive and negative attitudinal changes,
in addition to numerous reports of [statistically] nonsignificant
changes, resulted from interactions [of nondisabled persons] with
disabled persons as well as from the provision of educational and
general information. (p. 249)

The lack of a comprehensive, quantitatively based, integrative review of the
research on modifying attitudes toward the disabled was the problem addressed
by the research project presented on the following pages. The purpose of the
project was to develop a clearer portrayal of the status of research in this
area, useful to educational practitioners, policy makers, and researchers.


CHAPTER 2

AN ANALYSIS OF PRIOR REVIEWS

Primary research studies are to be designed and conducted within the
context of a critical review of prior research, according to a commonly
accepted canon of scholarship. It is a rare research report that does not
include a review of prior research. As scholarly contributions to knowledge,
reviews of the literature should also be based on prior efforts. According
to Jackson (1978):

Previously completed reviews of the topic or similar topics ought to be
consulted to assess what is already known on the topic, to refine
questions or hypotheses for the forthcoming review, to anticipate
problems that may be encountered when doing the review, to gain
familiarity with alternative ways of doing the review and to acquire
ideas for interpreting the results of the forthcoming review. Seldom is
a review topic so unique that the reviewer cannot benefit from examining
previous reviews. (p. 53)

Yet, as White, Bush, and Casto (1985-86) have pointed out, "reviewing and
reporting the work of previous integrative reviewers on similar topics is
seldom done in reports of 'literature reviews'" (pp. 417-18).

Of course, like any evidence, the results of previous reviews of the
literature should not be accepted and used uncritically. The critical
examination of procedures and conclusions is essential if the prior work is
to serve as a valid foundation for new efforts. Nevertheless, Jackson (1980,
p. 443) found that although 27 of the 36 reviews in his sample cited prior
reviews, only two provided critiques of the prior work. White, Bush, and
Casto (1985-86) found that ten of 52 reviewers cited at least three prior
reviews, but none provided a critical analysis of the prior work. In this
chapter, we present a critique of the prior reviews of research on modifying
attitudes toward disabled persons.



The  Reviews 

Our  search  of  the  research  literature  indicated  that  during  the  past  two 
decades,  over  200  investigations  of  methods  for  enhancing  attitudes  toward 
disabled  persons  had  been  reported.  Despite  the  amount  of  research,  there 
have been few published efforts to summarize and synthesize the findings. A
computer-assisted search of ERIC, CEC Abstracts, Dissertation Abstracts,
Index Medicus, Psychological Abstracts, and Social Science Research using
broad descriptors, along with a manual search of bibliographies and reference
lists in over 600 research reports and other publications pertaining to
attitudes and the disabled, yielded the titles of only seven reviews
(Anthony, 1972; Donaldson, 1980; Haddle, 1974; Horne, 1985; Towner, 1984)
devoted to the research literature on modifying attitudes toward disabled
persons or toward persons with a particular type of disability (mental
retardation: Sandler & Robinson, 1981; physical disabilities: Westwood,
Vargo, & Vargo, 1981). An additional eight reviews were identified that
contained brief sections on the general topic (Alexander & Strain, 1978;
Chubon, 1982; Horne, 1979) or on modifying attitudes toward persons with a
specific type of disability (mental retardation: Harth, 1973; mental
illness: Johannsen, 1969, Rabkin, 1972, Segal, 1978; physical disabilities:
Pulton, 1976). The purpose of this chapter, as noted above, is to examine
critically the full reviews and eight of the partial reviews with sufficient
relevance* in terms of their methodological soundness, their contributions to
knowledge, and the applicability of their findings.


*Pulton's (1976) review is not a traditional review of literature in the
sense of an effort to summarize the state of research. Rather, the review
is part of an effort to build a case for a particular approach to attitude
change. The review is frequently cited for its conclusions, however, so we
have included it in our critique. Similarly, Haddle's (1974) article is
included as a full review even though it contained two parts, the second of
which was directed toward building a rationale for using systematic
desensitization to modify attitudes. Despite the title of a paper by
Bernotavicz (1979), the focus is on the research on visual presentations
rather than on modifying attitudes toward disabled persons. Consequently, it
was not included in this chapter, even though it would be an excellent source
for someone interested in an intensive examination of information-media
attitude change studies. Other review articles (e.g., English, 1971; Frith &
Mitchell, 1981; Levitt & Cohen, 1976) were excluded because they did not
address the research on modifying attitudes toward disabled persons. The
brief review by Yuker et al. (1970, pp. 87-93) was not included, although
often cited, because of its focus on research with the Attitudes Toward
Disabled Persons (ATDP) scale. Also, Horne (in press) was not received in
time to be included, but it is in most respects similar to Horne (1985).

Judging the Quality of Reviews

Conducting interpretative reviews of the research literature is a common
social science activity. Well-defined procedures for assimilating the
findings of a number of primary studies have been formalized only recently,
however (Jackson, 1980; Light & Pillemer, 1982). Prior reviews have tended
to be "unsystematic . . . narrative, and subjective" (Light & Pillemer, 1982,
p. 2). That traditional narrative review approach has in recent years been
subjected to a great deal of criticism (e.g., Cook & Leviton, 1980; Cooper &
Rosenthal, 1980; Glass, 1976, 1977; Hunter, Schmidt, & Jackson, 1982;
Jackson, 1978, 1980; Rosenthal, 1984), with Jackson (1978, 1980) presenting
what is undoubtedly the most systematic study of the status of the review
literature.

Jackson (1978, 1980) analyzed a random sample of 36 integrative review
articles. He concluded, as we noted in Chapter 1, that important weaknesses
were pervasive. They included lack of attention to previous reviews,
incomplete literature searches, inadequate summaries of study findings, and
the absence of systematic examinations of relationships between study
characteristics and outcomes. Following up on these conclusions, Jackson
(1978, Ch. 6) proposed seven crucial tasks to be considered when planning a
review or judging the quality of an existing review. These tasks are: (1)
the selection and definition of the topic, (2) the use made of previous
reviews, (3) the selection of studies to be included in the review, (4) the
collection of data from the primary research reports, (5) the analysis of
data, (6) the interpretation of the results, and (7) reporting the review.
It is not a coincidence that the components to be considered in planning or
critiquing a review are similar to those to be taken into account in
designing or evaluating a piece of primary research. The reviewer and the
primary researcher share the goal of making accurate generalizations based on
data which they collect (Jackson, 1978, p. 7).

Using  the  seven  areas  identified  by  Jackson  (1980)  as  a  general 
framework,  we  developed  six  sets  of  questions  to  provide  the  context  for 
judging  the  quality  of  the  seven  reviews  and  the  brief  sections  in  the  eight 
general review articles dealing with modifying attitudes toward disabled
persons. 

1.  Formulating  the  Problem 

a.  Was  the  problem  clearly  defined  and  delimited,  and  its  importance 
justified? 

b.  Were  central  terms  clearly  defined? 

c.  Were  questions  identified  that  the  reviewer  attempted  to  answer  or 
hypotheses  stated  that  the  reviewer  sought  to  test? 

d.  Were  the  questions  and  hypotheses  adequately  warranted  by  reference 
to  theory,  previous  reviews,  research,  or  soundly  based  insight? 

2.  Building  on  Prior  Reviews 

a.  Were  previous  efforts  to  review  similar  bodies  of  research  cited? 

b.  Were  prior  reviews  critiqued  as  a  basis  for  (1)  the  justification  of 
another  review  as  different  from  or  an  extension  of  prior  reviews, 
(2)  an  appropriate  point  of  departure  for  the  review,  and  (3) 
avoiding the inadequacies and errors of prior reviews?

3. Selecting Studies to be Reviewed

a. Was the method of locating studies (e.g., the indexes, reference
lists, bibliographies used) described?

b.  Were  criteria  for  selecting  and  excluding  studies  to  be  included  in 
the  review  clearly  explained? 

c.  Was  a  representative  or  comprehensive  sample  of  prior  research  on  the 
problem  reviewed? 

d.  Did  the  sample  of  studies  reviewed  have  a  bearing  on  the  problem? 

e.  Was  the  sample  biased  either  by  being  too  small  or  by  the  failure  to 
include  relevant  studies? 

4.  Collecting  Data  from  the  Primary  Studies 

a. Were data collected for each study on common dependent and
independent  variables? 

b.  Were  data  collection  categories  defended  on  rational  and  empirical 
grounds? 

c.  Were  data  collection  procedures  specifically  described? 

d.  Were  findings  from  studies  recorded  in  terms  of  effect  sizes? 

5.  Analyzing  the  Data 

a. Did the examination of relationships between dependent and
independent variables take into account concomitant variables that
might have influenced the results, including sample characteristics,
assessment instruments, statistical techniques, and design factors?

b.  Did  the  reviewer  try  to  account  for  any  findings  within  the  sample  of 
studies  analyzed? 

c.  Were  serious  methodological  weaknesses  in  studies  identified? 

6.  Reporting  and  Interpreting  the  Findings 

a. Were the findings, including the results of analyses, reported
clearly, for example, using summary tables to help readers comprehend
the pattern of results from the primary research reports?

b. Were the conclusions of the review sufficiently supported by the data
and analyses?

c. Did the review contain implications for policy or practice?

d.  Did  the  reviewer  draw  conclusions  about  attitude  change  theories? 

e.  Did  the  review  contain  recommendations  for  future  research  or 
reviews? 


Review  of  the  Reviews 
On the following pages, the six sets of questions are applied, in the
order stated above, to the seven reviews and eight review sections. Specific
examples are included to illustrate the applications and, as is appropriate
in scholarly work, studies are cited so that our readers can check on our
interpretations and analysis.

Such specific referencing of weaknesses in studies is usually not done,
perhaps to avoid subjecting the authors who are cited to potential
embarrassment. But as careful attention to reviews of literature as a
scholarly activity becomes more common, critiques with specific citations,
such as Slavin's (1984) criticism of prior meta-analyses, ought to become
more frequent. Only in that way will readers be helped to develop a sense of
which reviews can be relied on and to what extent, and of the considerations
that are important in evaluating and preparing reviews.

It should be noted as well that, like the perfect piece of primary
research, the perfect (uncriticizable) review of research may be well nigh
impossible. Our own integrative review of research, to be reported in the
following chapters, will not elicit from its critics a strongly affirmative
answer to every question in the list. Furthermore, some of the reviews of
research we critique on the following pages were written before the recent
focus of attention on the conducting of reviews became a matter of general
scholarly concern. Finally, our intent is not to demean the previous
authors, but to try to learn from their efforts.

Formulating  the  Problem 

The appropriate starting point for a review of literature, as with
primary research, is a clear statement of the perplexity underlying the
scholarly effort. Hence, the first set of questions to serve as a context
for judging reviews has to do with problem formulation.

Rationales for conducting reviews of the research literature on
modifying attitudes toward disabled persons were provided in all but one
review (Haddle, 1974). For the most part, the reviewers acknowledged the
prevalence of negative attitudes toward the disabled, both in the community
at large (Anthony, 1972; Harth, 1973; Johannsen, 1969; Pulton, 1976; Rabkin,
1972; Sandler & Robinson, 1981; Segal, 1978; Westwood et al., 1981) and in
specific groups, such as teachers and school personnel, mental health
professionals, service providers, and employers (Alexander & Strain, 1978;
Chubon, 1982; Donaldson, 1980; Horne, 1979, 1985). Either directly stated or
implicit within most of these reviews was the need to mount attitude change
programs in order to provide better services to disabled persons (e.g.,
Alexander & Strain, 1978; Chubon, 1982; Donaldson, 1980; Horne, 1979;
Westwood et al., 1981) or to facilitate integration by reducing public
prejudice (e.g., Anthony, 1972; Harth, 1973; Johannsen, 1969; Sandler &
Robinson, 1981; Segal, 1978).

The rationales for two reviews, however, appeared to be based more on
scholarly interests than on the practical problem of mitigating the effects
of handicapism. Towner's (1984) orientation was theoretical. Her purpose
was to examine the "variability in the results" of a number of empirical
studies using factors hypothesized to be requisites to effective attitude
change interventions (p. 22?). Rabkin's (1972) interest, on the other hand,
was historical. Her brief review of attitude change studies was one aspect
of a general review of the literature that included descriptions of changing
patterns in attitudes toward mental illness and the treatment of mentally ill
patients, and of instruments commonly used to assess attitudes toward the
mentally ill.

Defining terms. As in designing primary research, the problem
underlying a review of research ought to be stated in terms sufficiently
precise to provide unambiguous guidance in the collection, analysis, and
interpretation of data. Such specificity requires both conceptual and
operational definitions of relevant variables.

The construct, "attitude", is central to the topics of the reviews of
literature critiqued in this chapter. (One review [Rabkin, 1972] was
included even though "opinions" was used in the title, because "attitudes"
was used interchangeably with "opinions" throughout the article.) Despite
the signal importance of the construct of attitude, only two reviewers
provided a conceptual definition of it. Harth (1973) described attitudes as
"predispositions toward behavior" (p. 150), a definition attributed to an
earlier work by Osgood, Suci, and Tannenbaum (1957). Johannsen (1969) used
the following definition of attitude, from Nunnally (1961): a "personal
disposition avoiding truth as an issue". According to Johannsen, Nunnally
concluded that information and attitude were distinctly different constructs
and that, while information refers to facts that can be either proven or
disproven, attitudes may or may not directly reflect agreed upon facts
(p. 219).

Although  Towner  (1984)  acknowledged  the  multi-dimensional  nature  of 
attitudes  and  reported  which  of  the  studies  she  reviewed  contained 
definitions  of  attitude,  she  did  not  present  her  definition  of  the  construct 
in  her  report. 

An  operational  definition  of  attitude  was  not  specifically  given  in  any 
of  the  15  reviews.  Implicit,  however,  was  the  presumption  that  attitudes 
were  represented  by  scores  on  whatever  attitude  measures  were  used  in  the 
primary  research. 

Conceptual definitions of the constructs used to identify disability
groups were also missing in most of the reviews and had to be inferred from
the context of the review. The disability constructs were not operationally
defined, either. Terms used frequently in titles to describe the population
of concern, such as "disabled", "handicapped", "mentally ill", and "mentally
retarded", lack precision; interpretations of the populations to which they
refer may vary widely. Johannsen's (1969) discussion of the range of
definitions accorded to "mental patient" and "mental illness" illustrates
that point, as does Rabkin's (1972) reference to the changing definition of
mental illness as psychiatry has moved from a medical model toward one of
"psychological conceptions of problems of being" (p. 155).

Questions and hypotheses. As with the planning of a primary study,
questions and hypotheses, usually formulated from theory and prior research,
should provide the focus for an integrative review (Jackson, 1980).
Questions or hypotheses were rarely a feature of the reviews and sections of
reviews under examination. In fact, hypotheses were not directly stated in
any review, and specific questions to be investigated were identified only by
Chubon (1982) and Donaldson (1980). It seemed likely that the questions in
these reviews had been developed from perusals of the research literature.
Questions in both reviews were stated in a very general manner, without
reference to specific variables or attitude change theories.

Comments. Conventional standards for problem statements as a basis for
research studies were generally not followed by the authors of reviews of the
research literature on changing attitudes toward the disabled. Reviews in
this area have frequently been weakened by the failure to provide adequate
definitions for relevant variables and by not centering the reviews on
significant questions and hypotheses developed from attitude theory and
research.

Building on Prior Reviews

The importance of reviewing prior reviews of literature was emphasized
in the paragraphs introducing this chapter. We noted that Jackson (1980) and
White, Bush, and Casto (1985-86) had not found the critical analysis of prior
work to be common in their samples of review articles. The present
investigation of reviews yielded similar findings. Of the 15 reviews and
brief reviews examined, only seven cited prior reviews; however, in only four
was it made clear that reviews of the literature were being cited (Horne,
1979; Sandler & Robinson, 1981; Towner, 1984; Westwood et al., 1981).

Lists  of  the  prior  reviews  available  at  the  time  each  review  was  written 
are contained in Table 1. The most-cited reviews were by Anthony (1972) and
Donaldson  (1980).  References  to  Anthony's  (1972)  review  were  found  in  Horne 
(1979),  Haddle  (1974),  and  Horne  (1985).  The  last  two  works  did  not, 
however,  identify  Anthony's  article  as  a  review  of  the  literature. 
Donaldson's  review  was  referred  to  by  Sandler  and  Robinson  (1981),  Towner 
(1984),  and  Westwood  et  al.  (1981). 

It  appears  that  the  cited  reviews  were  accepted  without  question. 
Neither the Donaldson nor the Anthony review was subjected to critical
examination,  although  methodological  weaknesses  in  each  will  be  noted  later 
in  this  chapter.  Donaldson's  review,  in  particular,  was  very  favorably 
described  in  Towner  (1984)  and  Westwood  et  al.  (1981). 

Justification of a new review. With the exception of Towner (1984), the
authors who cited prior reviews provided no rationale for conducting another
review; nor did they state how their review differed from earlier works.
Towner did acknowledge Donaldson's (1980) review, and she explained that her
review differed from Donaldson's in both "format" and "focus" (p. 223).
Furthermore, she suggested that, together, the two reviews offered a
"comprehensive analysis of the literature on modifying attitudes toward
disabled persons" (p. 223).

A comparison of the two reviews revealed that Towner's review did indeed
differ. Although the basic reporting style of both reviews was narrative and
both contained summary tables, they varied in significant ways. Towner's
review was more thorough and methodological. For example, Towner's
discussion of study methodologies was considerably more extensive than
Donaldson's brief one-sentence allusion to some studies whose weak designs
seriously threatened the generalizability of their findings (p. 505).
Moreover, Towner's application of attitude change theories in her analyses of
studies was much more systematic than Donaldson's attempt to explain the
theoretical bases for certain attitude change strategies. And Towner
examined more than twice the number of studies reported in Donaldson's
review. The justification she presented for doing another review was
evidenced in the review she wrote.

Comments. Using prior reviews as a basis for subsequent reviews has not
been a general feature of the literature on changing attitudes toward the
disabled. Fewer than half (47%) of the reviews examined in this study
referred to earlier reviews, and in none of the reviews were previous reviews
critically examined. In only one of the most recent and systematic reviews
(Towner, 1984) was there an attempt to build on prior works; however, only
one of the 13 available reviews and brief reviews was referenced.

Selecting  Studies  to  be  Reviewed 

Sample selection is an important, even if often ignored, factor in
primary research; similarly, the methods used to locate primary research
reports are an important factor in a review of research. Procedures should
be used that will locate the maximum number of primary studies (Jackson,
1980). A search employing computerized information retrieval systems, but
excluding printed bibliographies and reference lists in primary research
reports is, for example, likely to identify a set of studies that is not
representative of the entire body of research. The findings of a review
based on such a body of research could be severely limited, even misleading.
Since the adequacy of the literature search affects the generalizability of
the conclusions in a review, Jackson (1980) argued that it is the
responsibility of a reviewer to report the search strategy.

As with primary research, sampling bias is a serious threat to the
external validity of reviews of research. Glass (1976) has argued that all
studies that can be located on a topic (i.e., the accessible population of
primary research reports) should be included in an integrative review. If,
however, the accessible population of primary reports is not reviewed, the
sampling procedure should also be reported along with the search procedures
(Jackson, 1980; White et al., 1986), so that the reader of the review can
judge whether the sample represents the population of studies to which
generalizations are made.

The importance of addressing the method of sample selection can be
illustrated by comparing the number of primary research reports identified
for the review of literature reported in later chapters with those identified
in prior reviews. Our literature search yielded 273 studies that met
specific criteria (as described in Chapter 3) for inclusion in our review
of research on the modification of attitudes toward the disabled. An
additional 394 studies were discarded as not suitable* for the present
review. By the end of the period from 1950 (the earliest study located) to
July 1986, the body of attitude change studies included at least 706 titles
(including reports cited in prior reviews, but not included on any of the
lists of reviewed or discarded reports for this review), of which 239 were
theses or dissertations.
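
The counts in the preceding paragraph can be reconciled with a few lines of
arithmetic. The sketch below is illustrative only: it uses the figures given
in the text, the variable names are hypothetical, and it assumes that the 706
titles are the reviewed and discarded reports plus those cited only in prior
reviews.

```python
# Figures reported in the text above.
reviewed = 273        # studies meeting the inclusion criteria
discarded = 394       # studies judged not suitable for the review
total_titles = 706    # minimum size of the body of studies, 1950 to July 1986
theses = 239          # theses or dissertations among the 706 titles

# Reports appearing on the reviewed or discarded lists.
on_lists = reviewed + discarded

# Assumed remainder: titles cited in prior reviews but on neither list.
cited_only = total_titles - on_lists

# Share of the body of studies made up of theses or dissertations.
thesis_share = theses / total_titles

print(f"on the reviewed or discarded lists: {on_lists}")
print(f"cited in prior reviews only:        {cited_only}")
print(f"theses or dissertations:            {thesis_share:.1%}")
```

Under that assumption, 667 reports appear on the two lists, 39 titles are
known only from citations in prior reviews, and theses or dissertations make
up roughly one-third of the total.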

The sum total of individual studies concerned with attitude change cited
in the seven reviews and eight brief reviews was 192,** slightly less than
one-quarter of the studies available. The median number of primary studies
referenced in the full reviews was 32 (mean = 35); in the brief reviews, the
median was 10 (mean = 12).*** A correlation coefficient (Pearson
product-moment r) of .64 indicated a moderate relationship between number of
primary studies cited in each full review and the number of studies available
when the reviews were in preparation. A similar relationship (r = .18) was
not found for the eight brief reviews.
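
For readers who wish to see how a product-moment coefficient of this kind is
obtained, the following is a minimal sketch using only the Python standard
library. The paired values are invented for illustration and are not the
counts for the reviews examined here; the function name `pearson_r` and both
list names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: studies available when each review was prepared, and
# the number of studies that review cited.
available_hypothetical = [40, 90, 150, 210, 260, 300, 340]
cited_hypothetical = [12, 30, 25, 41, 38, 52, 47]

print(f"r = {pearson_r(available_hypothetical, cited_hypothetical):.2f}")
```

With as few as seven or eight pairs, a coefficient is sensitive to any single
review, so values such as those reported above are best read as rough
descriptions of the pattern rather than precise estimates.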


*Of these, 363 were deemed irrelevant because they, for example, were
correlational studies, used instruments judged not to fit our definition
of attitude, or assessed attitudes toward mainstreaming rather than toward
disabled persons. An additional 31 studies were discarded due to
lack of information. (See Appendix D.)

**Sixty-two percent (N=119) of these studies investigated the effects of a
variety of interventions (e.g., information about the disabled, personal
contact with disabled persons, simulations of disabilities) on attitudes
toward the disabled. Of the remaining citations, 24% (N=46) examined
relationships between attitudes toward the disabled and variables such as
amount of reported contact with disabled persons, knowledge of
disabilities, and membership in mainstreamed classes, and 13% (N=24)
assessed the efficacy of courses in psychology, special education, nursing,
and so forth for enhancing university students' attitudes toward the
disabled. Two percent (N=3) of the studies compared the attitudes of
different professional, student, and community groups.

***Median and mean numbers of studies referenced are rounded to whole
numbers, as references are cited in toto.

The relationship between reports available and reports cited is depicted
in Figure 1. The cumulative numbers of dissertations and other reports of
primary studies available from 1951 through 1985 are indicated by the shaded
areas, and the top line on the graph indicates the total cumulative number of
reports available. The bars on the graph show the number of reports cited in
the individual reviews, which are identified by letter in the list below the
graph.

In interpreting the information in Figure 1, as well as in contemplating
our discussion of the absence of a common core of cited studies in the next
section, it is important to remember that although the majority of the
reviewers were concerned with disabilities generally, two of the full reviews
were directed at attitudes toward persons with specific disabilities (mental
retardation: Sandler & Robinson, 1981; physical disabilities: Westwood et
al., 1981), while five of the brief reviews were also limited in scope
(mental retardation: Harth, 1973; mental illness: Johannsen, 1969; Rabkin,
1972; Segal, 1978; physical disabilities: Pulton, 1976). In addition, there
was some focusing on groups whose attitudes were of concern (e.g., school
psychologists: Horne, 1979; professionals: Chubon, 1976). Consequently, it
would not be correct to assume that all of the available reports of research
would be relevant to the defined topic of every review.

A common core? Seventy of the 192 studies cited in the reviews were
included in two or more reviews. However, the seven most frequently
referenced studies (Cleland & Chambers, 1959; Clore & Jeffrey, 1972;
Granofsky, 1956; Hicks & Spaner, 1962; Lewis & Cleveland, 1966; Warren,
Turner, & Brody, 1964; Wilson & Alcorn, 1969) were cited in only five of the
15 reviews. Another 14 studies (Altrocchi & Eisdorfer, 1961; Anthony, 1969;
Brooks & Bransford, 1972; Donaldson & Martinson, 1977; Evans, 1976; Glass &
Meckler, 1972; Holzberg & Gewirtz, 1963; Lazar, Gensley, & Orpet, 1971;
Rapier, Adelson, Carey, & Croke, 1972; Rusalem, 1967; Sadlick & Penta, 1975;
Shotel, Iano, & McGettigan, 1972; Strauch, 1970; Yerxa, 1971) were referenced
four times. The remaining 49 studies were cited either three times (17
studies) or twice (32 studies).

Reviews and Number of Reports Cited

a. Johannsen (1969), 12
b. Rabkin (1972), 25
c. Anthony (1972), 30
d. Harth (1973), 10
e. Haddle (1974), 31
f. Pulton (1976), 7
g. Segal (1978), 5
h. Alexander & Strain (1978), 5
i. Horne (1979), 15
j. Donaldson (1980), 24
k. Westwood et al. (1981), 29
l. Sandler & Robinson (1981), 36
m. Chubon (1982), 27
n. Towner (1984), 4^^
o. Horne (1985), 70

That even the most frequently cited studies appeared in so few of the
reviews suggests that even the reviewers concerned with disabilities
in general did not draw on a common core of research for their conclusions.
That observation is confirmed by the data in Table 2, a matrix of common
sources which was prepared to illustrate the extent to which reviewers cited
studies in common. The greatest number of common sources was shared by
Anthony (1972) and Haddle (1974). Of the 30 studies referenced in Haddle, 18
had been cited previously in Anthony's review. Other pairs of reviews that
cited a number of studies in common were Horne (1985) and Towner (1984),
with 17 shared citations; Donaldson (1980) and Towner (1984), with 14 shared
citations; and Towner (1984) and Westwood et al. (1981), with 14 citations
in common. For the most part, the overlap in reference lists was not great.
Two of the 15 reviews (Alexander & Strain, 1978; Segal, 1978) contained few
citations in common with other reviews.
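
A matrix of common sources of the kind described above can be tabulated
mechanically once each review's reference list has been reduced to a set of
study identifiers. The following is a minimal sketch, not the procedure used
to prepare Table 2; the function name, the review labels, and the study
identifiers are all hypothetical and serve only as illustration.

```python
from itertools import combinations


def shared_citation_counts(reference_lists):
    """Count the studies cited in common by every pair of reviews.

    reference_lists maps a review label to the set of studies it cites.
    Returns a dict keyed by (review_a, review_b) pairs in sorted order.
    """
    counts = {}
    for a, b in combinations(sorted(reference_lists), 2):
        # Set intersection gives the studies that appear in both lists.
        counts[(a, b)] = len(reference_lists[a] & reference_lists[b])
    return counts


# Hypothetical reference lists, invented for this example only.
example = {
    "Review A": {"study1", "study2", "study3"},
    "Review B": {"study2", "study3", "study4"},
    "Review C": {"study5"},
}
overlap = shared_citation_counts(example)
for pair, n_shared in overlap.items():
    print(pair, n_shared)
```

A review whose row contains mostly zeros, like the hypothetical Review C
here, shares few sources with the others.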

The data from the matrix support the conclusion that there was not a
core body of research common to the reviews; few studies appeared to have
acquired the status, or visibility, where their inclusion was requisite to an
adequate review of literature pertaining to modifying attitudes toward the
disabled. The data also suggest that later reviewers did not rely on
reference lists in prior reviews to obtain studies to examine.

Dissertations. Dissertations and theses, which comprised over one-third
of the studies located for our review, were almost totally ignored in the
prior reviews. The number (N=27) cited was slightly more than one-tenth of
the number available. The median and mean numbers of dissertations and
theses referenced in the full and brief reviews were, rounded to whole
numbers, 6 (mean = 5) and 3 (mean = 2), respectively. Moreover, the majority
(84%) of dissertation citations were references to abstracts in Dissertation
Abstracts and Dissertation Abstracts International, rather than to the actual
dissertations. One of the two references to Wyrick's (1968) magistral thesis
was, in fact, a reference to the work in a secondary source.

Search techniques. Given the lack of overlap among reference lists, it
is probable that methods for locating studies varied among the reviews. The
search method was reported in only one review (Chubon, 1982). Chubon
described his search as including a manual search of the journals published
for the "helping professions" (p. 25) and computer-assisted searches of ERIC,
Dissertation Abstracts, and Psychological Abstracts for the period 1960-1979.
Of the 102 articles he located, 62 were discarded because they did not
describe empirical studies. Twenty-eight of the remaining articles were
cited in the brief section on modifying attitudes toward the disabled in his
general review of literature. For the period Chubon reviewed, the population
of studies concerning attitude modification in this area numbered at least
105 dissertations, 5 theses, and 275 articles, unpublished papers, and
project reports.

It was not possible to identify the search strategies for the remaining
14 reviews. Neither could the reviewers' criteria for including or excluding
studies be ascertained. As a consequence, it was difficult to judge the
representativeness of the sample of studies cited in most reviews, and the
possibility of sampling bias as a threat to the generalizability of the
conclusions in each of these reviews has to be seriously considered. In two
reviews (Alexander & Strain, 1978; Horne, 1979), however, sampling bias was
beyond doubt, as references were limited to reports of successful
interventions only. Bias was also highly likely when only one study out of
the many available was cited in a discussion of an intervention (Pulton,
1976). Donaldson (1980) also provided an example of a limited sample of
primary studies, using only two studies of simulation, one positive and
one negative, to draw a conclusion.

Study relevance. A second concern with sampling is the appropriateness
of the research studies for the specific questions the review is intended to
address. Reviews which attempted to identify effective strategies for
modifying the attitudes of particular groups, such as health professionals
(e.g., Chubon, 1982), educators (e.g., Alexander & Strain, 1978), or peers
(e.g., Horne, 1985), tended to cite studies that were relevant to these
groups. On the other hand, reviewers who sought strategies for changing
societal attitudes (e.g., Anthony, 1972; Donaldson, 1980) had to extrapolate
to the general population from the findings of studies conducted with
specific populations, such as university psychology students, nursing
students in internship programs, and special education students in practica.
Only occasionally did reviewers remind readers of the importance of viewing
the conclusions from a limited set of findings with caution (e.g., Anthony,
1972).

Another difficulty was presented when reviewers drew inferences for
specific contexts from studies not directly relevant. Sandler and Robinson
(1981), for example, reviewed the literature in an attempt to locate "factors
which may be related to improved public attitude toward development of group
homes for mentally retarded people in the community" (p. 98). Only two of
the 33 studies they examined had a direct bearing on this problem. The other
31 reports included, among other things, interventions conducted in schools
and universities, the effects of mainstreaming on the attitudes of
nondisabled peers, and correlations between the amount of reported contact
with disabled persons or the amount of knowledge about disabilities and
attitudes toward the disabled.

Comments. The lack of information concerning literature search
strategies and sampling procedures cast doubt on the representativeness of
the studies cited in most reviews and severely limited the generalizability
of the conclusions of the reviews. Failure to include this information was a
particular problem because of the relatively small number of studies cited in
each review. Doctoral dissertations were a neglected source of primary
studies. When dissertations were referred to in reviews, frequently only the
abstracts in Dissertation Abstracts were cited. A search of the reference
lists in the reviews failed to reveal any studies that were commonly cited
once they were available.

Collecting  Data  from  the  Studies 

Along with identifying primary studies for an integrative review,
procedures for collecting data from the studies must be established. The
procedures for collecting data from primary studies for a review should meet
standards similar to those for data collection in primary research. That is,
they should be "systematic", "well-planned" (Borg & Gall, 1983, p. 840), and
"organized in a manner which facilitates analysis" (Gay, 1976, p. 218).
Furthermore, since the methods for collecting the data influence the outcomes
of analysis and the credibility of the interpretations in the review
(Jackson, 1980), they should be described carefully.

None of the 15 reviews contained a description of data collection
procedures. How data were collected could not be inferred in any of the
reviews, since, for the most part, the results of primary studies were
reported in narrative form. For example, there was no way of knowing if some
sort of coding sheet was used or if notes were taken while reading studies.
Failure to include information on data collection, or to present results in a
manner that makes the method evident, appears to be prevalent in reviews of
research (Jackson, 1980; White et al., 1986).

Although data collection procedures were not described, some inferences
can be drawn about adequacy of collection. For example, the reporting of
primary studies in most reviews was organized under broad categories
according to intervention techniques, with "contact" and "information" the
most frequently used headings. There was a tendency to describe intervention
strategies only in a very general manner. Important study characteristics,
such as the treatment setting, who conducted the treatment, and treatment
length, were ignored, suggesting that data were not collected on these
variables. For example, treatment length, which may be a significant factor
in the effectiveness of an intervention, was mentioned for some studies in
only five reviews (Donaldson, 1980; Haddle, 1974; Horne, 1985; Rabkin, 1972;
Towner, 1984) and for most studies in only one review (Anthony, 1972).
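
The kind of coding sheet alluded to above could take many forms. The
following is one possible sketch, not a form used by any of the reviewers or
by this project; the class name, field names, and the citation string are
hypothetical. It records the study characteristics discussed in this section
and reports which items a reviewer has not yet coded for a given study.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class CodingSheet:
    """One record per primary study; None means 'not yet coded'."""
    citation: str
    is_empirical: Optional[bool] = None       # research report vs. program description
    intervention_type: Optional[str] = None   # e.g., "contact", "information"
    treatment_setting: Optional[str] = None
    treatment_agent: Optional[str] = None     # who conducted the treatment
    treatment_length: Optional[str] = None
    dependent_measure: Optional[str] = None
    finding: Optional[str] = None             # what the data showed
    author_recommendation: Optional[str] = None  # kept apart from findings


def missing_items(sheet):
    """Return the names of the fields that are still uncoded."""
    return [f.name for f in fields(sheet) if getattr(sheet, f.name) is None]


# Hypothetical entry, invented for this example only.
sheet = CodingSheet(citation="Hypothetical study", intervention_type="contact")
print(missing_items(sheet))
```

Keeping findings and author recommendations in separate fields is a design
choice intended to make it harder to report one as the other.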

Describing interventions. The reviews would have been strengthened by
adequate description of the intervention techniques. For instance, in
discussing "enforced contact" as a strategy for modifying attitudes toward
the disabled, Sandler and Robinson (1981) stated, ". . . Aloia, Beaver, and
Pettus (1978), Leyser and Gottlieb (1980), and Marlowe (1979) have also
reported the use of carefully planned interventions which improved the social
status of integrated EMR children" (p. 99). In each of these studies,
however, there were important differences not only in sample characteristics,
but in treatments. These differences were not identified, nor was the reader
informed whether the differing interventions were equally effective.

Horne's (1979) review provides a similar example. Contending that the
research has indicated that promoting peer acceptance will lead to positive
attitude change, Horne stated that "positive peer interaction . . . may be
facilitated . . .", and cited Kirby and Toler (1970) and Strain and Timm
(1974) to support that point. However, distinctly different strategies were
employed in each study. It was not indicated in the review that the former
article reported a study in which a 5-year-old boy's interaction with peers
increased when he passed out candy, while the latter article described a
program which utilized praise and contact to increase appropriate social
behaviors in a "disordered pre-school child". Again, the question is raised
whether such information was collected.

Another serious problem occurs when the primary studies are referred to
in such a way that the reader is not even able to infer the general
categories of intervention techniques. An example comes from Chubon (1982).
Following a statement about the mixed results with treatments designed to
"enhance attitudes of teachers and students majoring in various areas of
education toward disabled students", Chubon stated, "while some attitude
change programs seemed to produce the desired results . . . others have
produced no changes" (p. 26). He cited Kuhn (1971), Parish, Eads, Reece, and
Piscitello (1977), Wilson and Alcorn (1969), and Zukerman (1975) as relevant
studies. Without referring directly to each study, the reader is unaware
that the intervention techniques included exposure to blind children (Kuhn,
1971), an introductory special education course (Parish et al., 1977),
simulations of disabilities (Wilson & Alcorn, 1969), and gaming (Zukerman,
1975). The implication is that data were not collected on important
treatment attributes such as these.

A similar example is taken from Horne (1985). Commenting on the
efficacy of programs to enhance teachers' attitudes toward disabled students,
Horne wrote, "Sometimes changes have occurred in negative directions . . ."
(p. 158). References for this statement were Bradfield, Brown, Kaplin,
Richert, and Stannard (1973), Warren, Turner, and Brody (1964), and Sellin
and Mulchahay (1965). Again, without referring directly to these studies,
the reader does not know that the treatments consisted of a single trip by
high school seniors to a state institution for the mentally retarded (Sellin
& Mulchahay, 1965), tours of institutions for the sight impaired, hearing
impaired, and the mentally retarded conducted within the context of an
integrated "psychology-education-sociology" program for sophomore college
students (Warren, Turner, & Brody, 1964), and an inservice training program
designed to instruct teachers of integrated classrooms in individualized
instructional techniques (Bradfield, Brown, Kaplin, Richert, & Stannard,
1973). Again, it is open to question whether data were collected on these
central treatment characteristics.

Program description-research confusion. Some authors cited narrative
descriptions of programs along with reports of primary research, which raises
questions about their data collection procedures. Johannsen (1969), for
example, in a discussion of employers' attitudes toward hiring ex-mental
patients, referred, respectively, to programs described by Murray (1958) and
Brennan and Margolin (1954) with the phrases "promising results were
forthcoming" and "success was also reported". These references were included
with references to primary research studies, and the title of Johannsen's
article described it as a "review of empirical research" (p. 218).
Consequently, a reader might assume that both the Murray and Brennan and
Margolin articles reported the findings of empirical studies. However,
neither was a report of research; both were narrative accounts of programs
advocated by the authors for encouraging the acceptance of ex-mental patients
in the workplace.

Conclusions-findings confusion. In a somewhat similar vein, questions
can be raised about data collection when reviewers treat recommendations from
the summary and conclusions sections of reports as though they were research
findings. Rabkin (1972), for instance, examined the research pertaining to
modifying mental hospital attendants' attitudes toward patients. She stated
that "Middleton's (1953) work suggests that training [for attendants] ought
to include a thorough indoctrination about etiology, treatment results,
examples of success, and reasons for failure, with periodic repetitions of
this training" (p. 163). In his study, Middleton compared the attitudes of
attendants and non-attendants in a mental hospital. In the conclusions
section of his article, he recommended the type of instruction referred to by
Rabkin. However, such instruction was not provided to the subjects in his
study.

A similar example comes from Johannsen's (1969) brief review of the
research on changing attitudes toward mental patients. He suggested that the
effectiveness of personal contact as an attitude change strategy may depend
upon others' perceptions of mental patients' behavior as being normal. He
concluded with the statement, "Halfway houses . . . serve a public education
function by enabling relatives to see the patient move gradually into the
community" (p. 224), and cited Pettit (1956) as the source. Pettit's study
investigated the attitudes of relatives toward deinstitutionalizing
"long-hospitalized" mental patients. In the closing section of his article,
Pettit suggested that halfway houses might ease the movement of mental
patients into the community. This was an opinion, and the adequacy of halfway
houses for modifying the attitudes of relatives was not examined.

Other problems. Clearly, collecting data in such a way that
researchers' conclusions and recommendations are kept distinct from findings
would enhance the validity of reviews. Other problems that may stem from
inadequate data collection are presenting studies incorrectly, failing to
consider all of the results from a piece of research, and citing irrelevant
studies. Instances of such practices were identified in 11 of the 15
reviews. The following examples were selected from Sandler and Robinson
(1981), Alexander and Strain (1978), Towner (1984), and Horne (1979).

Reporting  studies  incorrectly 

In their discussion of information as a factor in changing attitudes
toward the disabled, Sandler and Robinson (1981) indicated that Begab (1969)
". . . attempted to improve the attitude of social work students by providing
them with coursework on mental retardation and found that knowledge alone had
little effect upon attitude. When coupled with direct contact with mentally
retarded persons through a fieldwork experience, however, there was positive
increase in attitude" (p. 99). A perusal of the abstract of Begab's
dissertation (cited by Sandler and Robinson) revealed that he did not attempt
to change students' attitudes; he conducted a survey of students enrolled in
several schools of social work to investigate the "impact of differences in
curricula and experiences on social work students' attitudes and knowledge
about mental retardation" (Begab, 1969, p. 4111-A).

Alexander and Strain (1978) included Lane's (1976) dissertation abstract
in their review of educators' attitudes toward disabled children and
mainstreaming. They stated that his study "showed that a background in
special education can help alleviate stereotypes or prejudice toward
exceptional children". However, the purpose of Lane's study was to
"investigate the effects of labels conveying ethnic group membership and
retardation on evaluative statements made by prospective teachers" (Lane,
1976, p. 1491-A). The subjects were dual majors in elementary and special
education. No comparison was made with a control group of subjects who did
not have a special education major. The following hypothesis was not
supported: "Knowledge of a child's ethnic group label or label indicating
the presence or absence of mental retardation would result in more negative
ratings for those labeled retarded than not so labeled" (Lane, 1976,
p. 1491-A). Lane concluded that "Neither the label nor the ethnic
identification were [sic] found to significantly affect ratings and no
interaction effects were demonstrated" (p. 1491-A). The relationship between
a background in special education and attitude toward exceptional children
was not a concern in the study.

Partial  results 

When only partial results of a primary study were reported, suggesting
partial collection of data, misleading conclusions were sometimes drawn about
the effectiveness of an intervention, or factors that had a bearing on the
outcome of the study were not described. For instance, Towner (1984)
reported that simulation exercises were successful in modifying the attitudes
of elementary children in a study by Dahl, Horsman, and Arkell (1978).
Towner's conclusion, however, was based on a statistically significant
F-ratio for the difference between the adjusted posttest means of the
experimental and control students on only one of four items in a social
distance scale. F-ratios for the differences between the adjusted group
means on the remaining three items and on two additional attitude measures
were not statistically significant. Dahl et al. (1978) stated that the
results "suggest that simulations may have limited value for changing Grade 5
students' attitudes toward their handicapped peers" (p. 574).

Selective  reporting 

An example in which the selective reporting of findings, perhaps
reflecting the selective collection of data, resulted in the loss of valuable
information about an important mediating variable comes from Sandler and
Robinson (1981). They reviewed several studies pertaining to the effects of
mainstreaming mentally retarded children on the attitudes of their
non-disabled peers and concluded that "investigations involving EMR children
integrated into regular classrooms have consistently shown more negative
attitude change among nonhandicapped children related to increased exposure
to EMR children" (p. 98). Among the studies cited was Goodman, Gottlieb, and
Harrison (1972). A perusal of this study revealed that gender was a
significant factor in the acceptance of mainstreamed mentally retarded
children: although boys rejected integrated mentally retarded children more
than segregated ones, the acceptance or rejection of mentally retarded
children by girls was independent of educational setting. This important
relationship was disregarded in the Sandler and Robinson review.

Horne's (1979) review provided an example of an irrelevant citation that
might be due to incomplete data collection. Referring to studies by Kearney
and Rocchio (1956) and LaBue (1959), Horne wrote, "Studies done using teacher
populations have included an exploration of the relationship between
information about the handicapped and attitudes toward the handicapped . . .
as well as the effect of teaching experience or contact" (p. 63). In both
studies, the Minnesota Teacher Attitude Inventory was used to assess
attitudes, and in neither article was reference made to handicapped children
nor was attitude toward the disabled measured.

Outcomes.  Another  important  aspect  of  a  research  review  is  the 
treatment  of  the  dependent  measures  used  in  the  primary  studies.  For  the 
most  part,  the  reviewers'  reports  of  data  pertaining  to  dependent  variables 
suggested  that  data  collection  was  inadequate.  Dependent  variables  were  not 
adequately  described,  nor  was  it  possible  to  discern  how  the  reviewers 
gathered  data  concerning  study  outcomes. 

In only three (Haddle, 1974; Rabkin, 1972; Towner, 1984) of the 14
reviews were the means of assessing dependent variables identified for
individual studies. Of the remaining 11 reviews, three (Alexander & Strain,
1978; Anthony, 1972; Segal, 1978) reported dependent measures for several but
not all of the primary studies cited, and eight (Chubon, 1982; Donaldson,
1980; Harth, 1973; Horne, 1979; Johannsen, 1969; Pulton, 1976; Sandler &
Robinson, 1981; Westwood et al., 1981) made no mention of how study outcomes
were measured.

In addition, the findings of studies were generally reported only
narratively, suggesting that quantitative data, either statistical
significance or effect sizes, were not collected. There were two exceptions.
Statistical significance levels were reported by Haddle (1974), and Towner
(1984) provided summary tables with studies identified as successful if
"statistically significant results" were obtained. Donaldson (1980) also
provided summary tables in which she grouped studies according to "positive
change", "no change", and "negative change" (p. 506). Whether these headings
referred to statistical significance or some other criterion could not be
determined from the article.

Typical of the narrative statements reporting the findings of studies
were the following:

Quay, Bartlett, Wrightsman, and Catron (1961) used three methods of
presenting material to a group of attendants. They found that the
formal lecture method was more effective in changing reported attitudes
than either the discussion method or use of a booklet. (Rabkin, 1972,
p. 165)

Voeltz  (1980)  has  demonstrated  that  providing  structural  contact 
experiences  to  elementary  school  children  is  related  to  more  positive 
attitude  toward  integrated  severely  handicapped  students.  (Sandler  & 
Robinson,  1981,  p.  99) 

Evans (1976) has also reported that interaction with disabled persons
which was structured to alleviate interpersonal discomfort was effective
in enhancing the attitudes of psychology students, and Hersh, Carlson,
and Lossino (1975) found that social work students' attitudes towards
retarded persons seemed to be enhanced by interaction with families
having retarded children. (Chubon, 1982, p. 27)

Negative  changes  in  attitude  toward  and  optimism  concerning  the 
integration  of  exceptional  learners  were  found  by  Fenton  (1975)  and 
Shotel  et  al.  (1974).     (Towner,  1984,  p.  236) 

In another summer workshop, administrators and classroom teachers were
involved in: (1) a practicum placement working with handicapped
children for 3 hours per week; (2) two additional hours of observation;
(3) enrollment for 9 hours of graduate credit; and (4) weekly
sensitivity sessions. Post-testing using a semantic differential showed
significant and positive attitudinal changes (Brooks & Bransford, 1971).
(Horne, 1985, p. 158)

The 15 reviews were replete with similar statements in which effect sizes
were not given or statistical tests or significance levels were not reported.

One danger of not collecting, and therefore not reporting, levels of
statistical significance in statements alluding to positive findings in
primary studies is that the reader may be tempted to assume that statistical
significance was reached. In five of the 14 reviews, such a conclusion could
be erroneous. For example, Horne (1979) referred to a specific study to
support her advocacy of programs designed to modify the behaviors of disabled
students which inhibited their acceptance by nondisabled peers and programs
to "facilitate acceptance on the part of regular classroom members" (p. 64).
She commented, "Although few such programs have been presented in the
literature, Simpson, Parrish, and Cook (1976) have demonstrated the efficacy
of using such procedures with elementary school pupils" (Horne, 1979,
p. 64). The article by Simpson et al. described two studies, the first
conducted with primary children and the second with grade 5 students. The
designs for both studies included control groups, and the Attitude Toward
Disabled Persons Scale was used to assess attitudes. The t-test was used to
analyze the differences between means for the primary children. The
difference between the posttest scores of the experimental and control groups
was not statistically significant. ANOVA was used to analyze the differences
among the mean change scores of the three fifth grade experimental groups and
the one control group (or two control groups; the report is unclear). The
F-value was not statistically significant. Horne's statement might lead
readers who did not have access to the primary report to assume that the
results were statistically significant.

A  second  example  comes  from  Anthony  (1972).  He  referred  to  a  study 
reported by Cowen, Underberg, and Verrillo (1958) with the comment, "These
researchers found that individuals who had had contact with the blind tended
to have more negative attitudes than individuals reporting no contact" (p.
118). Cowen et al. reported developmental data for the Attitude Toward
Blindness Scale. Their sample consisted of university students in
educational psychology courses who were divided into groups according to
reported contact or no reported contact with blind persons. The mean score
for the contact group was 54.53 (SD = 12.21), while the mean score for the no
contact group was 53.58 (SD = 12.21). Not only is the difference visibly
small, especially relative to the standard deviation, but the t value for the
difference between the means of the two groups was 0.39, clearly not
statistically significant (p. 300). In contrast with Anthony's comment,
Cowen et al. concluded that ". . . in the present study, though clearly not
significant [underlining added], there is a slightly higher mean score (more
negative attitudes) reported by those who have had previous contact with the
blind" (Cowen et al., 1958, p. 300).
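
The size of the Cowen et al. difference can be made concrete by expressing it as a standardized mean difference, the kind of effect size on which the integrative review in this report relies. The following is a minimal sketch that uses only the means and standard deviation quoted above. The function name is hypothetical, and because the group sizes are not given in this passage, the reported t value is not recomputed.

```python
def standardized_mean_difference(mean_a, mean_b, sd):
    """Return (mean_a - mean_b) / sd, a simple standardized mean difference."""
    return (mean_a - mean_b) / sd

# Values quoted above from Cowen et al. (1958). Both groups are reported with
# SD = 12.21, so that value is used as the common standard deviation
# (an assumption of this sketch).
contact_mean = 54.53
no_contact_mean = 53.58
common_sd = 12.21

d = standardized_mean_difference(contact_mean, no_contact_mean, common_sd)
print(f"standardized mean difference = {d:.2f}")
```

On these figures the two groups differ by less than one-tenth of a standard deviation, which is consistent with the authors' own description of the difference as clearly not significant.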

Comments. The failure to describe data collection procedures was common
to the 15 reviews. The problems in the way data were presented that seemed
likely to reflect inadequate data collection included the following: (1)
studies were grouped together under summary statements that were not accurate
for all of the studies cited and that ignored differences in independent and
dependent variables; (2) studies were cited with no information provided
about treatment variables; (3) comments in the conclusion or summary sections
of primary articles were referred to as if they were findings from the
studies; (4) program descriptions were referred to as though they were
research studies; (5) studies were presented inaccurately, including reports
of partial findings and the citation of irrelevant studies.

In  addition,  narrative  reporting  of  the  findings  from  primary  studies 
was  common  to  the  reviews.    Only  two  of  the  14  reviews  even  reported  whether 
primary results were statistically significant. In some reviews, findings
were reported in a manner that implied statistical significance when, in
fact, significance had not been reached.

Analyzing  the  Data  Collected  from  Primary  Studies 

As with primary research, after data collection the next stage in
conducting a review of research is the analysis of the data collected from
primary studies. The analysis is the basis for the reviewer's inferences
from the findings. Analysis is complicated by a number of factors (e.g.,
sample characteristics, design, intervention implementation, the assessments
used) that may vary with outcomes and should be considered when attempting to
draw conclusions (Jackson, 1980).

Important  decisions  that  could  significantly  affect  the  quality  of  the 
review  face  the  reviewer  during  the  analysis  of  data  stage.  Among  the 
matters that must be decided are how to attempt to get at relationships
between outcomes and the concomitant variables that might have confounded
findings, how to treat the results from studies with multiple, even
conflicting, outcomes, how to weight studies that vary in degree of
methodological soundness, and what analyses to conduct to explain
contradictory findings among primary studies.

Concomitant  variables.  None  of  the  15  reviews  being  critiqued  in  this 
chapter  examined  the  effects  of  concomitant  variables  such  as  age  of  subjects 
and gender. As will be noted later, Towner (1984) did at least examine
methodological deficiencies, even though she did not relate them directly to
differences in outcomes. Most of the 15 reviews dealt with simple
treatment-dependent variable relationships, despite the number of sample
attributes (e.g., age, sex, intelligence, education) and intervention
characteristics (e.g., setting, length of treatment) that might have been
related to changes in attitudes toward disabled persons. That approach might
have been in part a reflection of the design and the analysis strategies in
many of the primary studies which were reviewed. Particularly in the earlier
ones, the t-test was often first used to test the difference between the
pretest means of the experimental and control groups; if no statistically
significant pretest difference was found, then the t-test was used to test
pre-post mean differences for each group. If the experimental group mean
gain was statistically significant and the control group mean gain was not,
it was concluded that the difference between the two was significant. When
the posttest means for treatment groups were compared, using the t-test or
one-way analysis of variance, other factors that might have had a bearing on
the outcomes were usually not included in the analysis.
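
The weakness of inferring a group difference from separate pre-post tests can be illustrated with a small sketch. The gain scores below are invented for illustration and come from no study reviewed here. The helper names are hypothetical, and the critical values are the standard two-tailed .05 values of t for 9 and 18 degrees of freedom. In this example the experimental gain is significant by the within-group test and the control gain is not, yet a direct test of the difference between the two mean gains falls well short of significance.

```python
import math
import statistics

def paired_t(gains):
    """t statistic for testing whether a mean gain differs from zero."""
    n = len(gains)
    return statistics.mean(gains) / (statistics.stdev(gains) / math.sqrt(n))

def independent_t(a, b):
    """Pooled-variance t statistic for the difference between two group means."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se

# Hypothetical pre-to-post gain scores, 10 subjects per group.
experimental_gains = [5, -1, 3, 0, 4, 1, 2, -2, 6, 2]
control_gains = [4, -2, 2, -1, 3, 0, 1, -3, 5, 2]

T_CRIT_DF9 = 2.262   # two-tailed .05 critical value, df = 9
T_CRIT_DF18 = 2.101  # two-tailed .05 critical value, df = 18

t_exp = paired_t(experimental_gains)
t_ctl = paired_t(control_gains)
t_diff = independent_t(experimental_gains, control_gains)

print(f"experimental gain: t = {t_exp:.2f}, significant = {abs(t_exp) > T_CRIT_DF9}")
print(f"control gain:      t = {t_ctl:.2f}, significant = {abs(t_ctl) > T_CRIT_DF9}")
print(f"gain difference:   t = {t_diff:.2f}, significant = {abs(t_diff) > T_CRIT_DF18}")
```

The sketch suggests why the two-step procedure is unsafe: "significant" and "not significant" can describe two gains that do not themselves differ reliably, so the comparison of interest has to be tested directly.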

Even  when  the  primary  studies  involved  complex  designs  with  complex 
analyses, however, the tendency was to report the findings of these studies
in  simple  treatment-outcome  terms.  The  reference  to  Hafer  and  Narcus's 
(1979)  study  in  Westwood  et  al.  (1981)  was  typical  of  the  treatment  of 
complex  findings  in  most  of  the  reviews. 

The  purpose  of  the  Hafer  and  Narcus  study  was  to  investigate  the  effects 
on college students' attitudes toward disabled persons of viewing a film
(Like Other People) designed to present the needs and feelings of persons
with cerebral palsy as similar to those of nonhandicapped people, as compared
to viewing a Laurel and Hardy comedy. Whether pretesting would have an
effect was also investigated. The Attitude Toward Disabled Persons scale
(ATDP) was administered to two of the four randomly assigned groups as a
pretest. All groups took the ATDP immediately following the viewing of one
of the two films. Six weeks later, the ATDP was administered again to the
total sample.

Multiple  classification  analysis  of  variance  revealed  that  there  was  a 
statistically  significant  mean  difference  between  the  groups  which  saw  the 
two  films,  with  the  Laurel  and  Hardy  group  having  the  higher  (more  positive) 
mean. There also were statistically significant interactions between the
"specific movie seen and whether a group was pretested or not" and between
"specific movie and time of administration of the posttest" (p. 99).
Nonpretested students who saw the two movies had similar posttest means, but
pretested students had a higher mean with the Laurel and Hardy movie than
with Like Other People. The mean difference between viewers of the two films
detected at the first posttest was, however, not evident at the follow-up
testing. Despite this complexity, the Hafer and Narcus study was reported as
follows in the Westwood et al. (1981) review: "[O]ther studies involving
college students (Hafer & Narcus, 1979; Wyrick, 1968; Yerxa, 1971), nursing
students (Rosswurm, 1980), high school students (Forader, 1970), and grade
school students (Perkins-Karniski, 1978) also produced equivocal results with
three showing positive change (Perkins-Karniski, 1978; Rosswurm, 1980; Yerxa,
1971) and three showing no change (Forader, 1970; Hafer & Narcus, 1979;
Wyrick, 1968)" (p. 221).

Study quality. As was indicated in the footnote on page 20, of the 192
primary studies that were cited in the seven full and eight brief reviews,
119 were investigations of the efficacy of interventions for modifying
attitudes toward disabled persons. An additional 24 studies assessed the
effects of university programs and courses on students' attitudes toward the
disabled. These 143 studies comprised 74% of the individual studies
referenced in the 15 reviews. As part of the integrative review to be
reported in the following chapters, 114 (about 80%) of the 143 studies were
coded for treatment validity and internal validity, using the coding
instrument presented in Appendix B. The two types of validity were coded for
the other 20 percent for this review of reviews.

None  of  the  143  primary  studies  was  judged  to  be  "excellent"  in 
treatment  validity  (Category  C,9.D.)  or  "high"  in  internal  validity  (Category 
E.3.i.). The treatment validity of 55% (N = 79) of the studies was coded as
"fair", and that of the remaining 45% (N = 64) was coded "poor". Major
threats to treatment validity were multiple treatment interference,
shortcomings in treatment implementation, test by treatment interaction, and
experimenter effects. The internal validity of 36% (N = 52) of the studies
was judged to be "medium"; that of the remaining 64% (N = 91) was coded as
"low". Selection, history, and instrumentation were found to be serious
threats  to  internal  validity  in  many  studies. 
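
The percentages in the two preceding paragraphs follow directly from the reported counts, as a short arithmetic check shows. The counts are those given above; the helper name and dictionary labels are hypothetical.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest whole number."""
    return round(100 * part / whole)

cited_studies = 192                # primary studies cited in the 15 reviews
intervention_studies = 119 + 24    # intervention studies plus program/course studies

shares = {
    "share of cited studies": percent(intervention_studies, cited_studies),
    "coded for the integrative review": percent(114, intervention_studies),
    "treatment validity fair": percent(79, intervention_studies),
    "treatment validity poor": percent(64, intervention_studies),
    "internal validity medium": percent(52, intervention_studies),
    "internal validity low": percent(91, intervention_studies),
}
for label, value in shares.items():
    print(f"{label}: {value}%")
```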

Contrary to the above findings, only a few reviewers expressed concern
for the general quality of the primary studies they cited. Towner's (1984)
conclusion that the findings of many of these studies "can only be
characterized as contaminated" (p. 251) was in accord with earlier comments
by Anthony (1972), Chubon (1982), Donaldson (1980), and Haddle (1974).
However, only Towner (1984) identified the studies she judged to be
methodologically weak and attempted to examine systematically the
methodological deficiencies in the studies. Anthony limited his criticism to
contact studies in which the independent variable was the self-reports of
respondents as to the amount of their contact with disabled persons. He
argued that such studies were "methodologically deficient" (1972, p. 119),
because each subject decided for himself or herself just what was meant by
contact. Donaldson questioned the generalizability of several studies she
considered to be poorly designed, but there was no indication that these
studies were assigned a low rating when the effectiveness of particular
interventions was weighed. Similarly, although Chubon (1982) and Haddle
(1974) commented that many studies had weak designs, they did not identify
specific studies. It was, therefore, difficult to determine whether adequacy
of design was weighted in their discussions of findings.

Theory base. Whether outcomes vary between studies with and without a
theoretical focus is an important analysis question. Of 224* reports of
primary research coded for the meta-analysis reported in following chapters,
only 20% (N = 44) explicitly identified the attitude change theories upon
which the interventions were based. Only five of the reviews (Chubon, 1982;
Donaldson,  1980;  Haddle,  1974;  Harth,  1973;  Towner,  1984)  mentioned  this 
shortcoming.  And,  with  the  exception  of  Towner,  comments  concerning  the 
failure  of  most  researchers  to  ground  their  studies  in  theory  were  general  in 
nature;  studies  that  lacked  a  theoretical  base  were  not  specifically 
identified.  Whether  this  lack  of  a  theoretical  base  for  the  primary  studies 
was  considered  in  drawing  the  conclusions  in  these  reviews  is  unclear. 

Attitude  assessment.  Whether  outcomes  covary  with  quality  of  outcome 
assessment  is  another  pertinent  analysis  question.  Critical  comments 
concerning  the  assessment  of  attitudes  were,  however,  infrequent  in  the 
reviews,  and  relationships  between  instrumentation  and  outcomes  were  not 
discussed. Although six reviewers (Alexander & Strain, 1978; Anthony, 1972;
Haddle, 1974; Rabkin, 1972; Segal, 1978; Towner, 1984) identified some or all
tests used in the primary studies they examined, Towner alone pointed out
inadequacies in the reporting of assessment in many primary studies. In
particular, Towner criticized the lack of reliability and validity data in
the instrumentation sections of most primary research reports. Furthermore,
Towner noted that instruments developed for individual studies were often
poorly described and test development data were seldom given. She did not,
however, use that information in analyzing the variance among study outcomes.
Chubon (1982) also criticized the means for assessing attitudes in the
studies he reviewed, but his comments were general, with references to
neither specific studies nor tests. Nor did Chubon specifically consider
weaknesses in instrumentation when he judged the effectiveness of the
interventions he reviewed.

*This number excludes studies of mainstreaming in classrooms and studies for
which the information to compute effect sizes was not available.

A  perusal  of  the  primary  studies  included  in  the  15  reviews  disclosed 
numerous  examples  of  poorly  designed  and  inadequately  reported  assessment. 
Much  like  the  primary  reports  coded  for  the  integrative  review  reported  in 
the following chapters*, validity and reliability data were infrequently
reported in the primary studies included in the 15 reviews. This was true
even for those instruments such as the Opinions About Mental Illness scale
(OMI) and the Attitude Toward Disabled Persons scale (ATDP) for which
extensive documentation exists. The failure to provide satisfactory
descriptions and developmental data for tests prepared for specific studies
was common, too. Examples of the latter problem were found in, among others,
Harasymiw and Horne (1975), Quay, Bartlett, Wrightsman, and Catron (1961),
Shotel, Iano, and McGettigan (1972), and Stephens and Braun (1980).
Nevertheless, the reviewers who referenced several (Chubon, 1982; Donaldson,
1980; Harth, 1973; Rabkin, 1972) or all (Sandler & Robinson, 1981) of these
studies apparently overlooked these deficiencies in their analyses of study
findings.

*With 704 effect sizes computed for 214 studies, reliability coefficients for
the attitude assessments were reported for 411, or 58%; validity was
mentioned for 330, or 47%, of the 704 effect sizes.

Conflicting  results.  Table  3  records  the  results  from  the  primary 
studies that were included in the reviews, arranged according to review
authors and attitude change strategies. As would be expected, most reviewers
included studies in which the interventions were deemed to be successful in
modifying attitudes toward the disabled and studies which produced
nonsignificant or negative findings. Deciding how to handle discrepant
findings is an essential part of the analysis of data for a review, and has
been the subject of a number of articles. Light and Pillemer (1982) proposed
that contradictory findings provide the reviewer with "an opportunity to
examine and explain variations in outcomes" (p. 6). In an earlier article,
Light and Smith (1971) argued that to ignore discrepant findings is to
"assume that genuinely contradictory results can never be a valid description
of  reality"  (p.  438). 

Systematic  approaches  to  the  examination  of  conflicting  findings  were 
described  in  Jackson  (1980),  Ladas  (1980),  and  Light  and  Smith  (1971).  They 
suggested that conflicting findings may result from a number of factors,
including  sampling  error,  grouping  under  the  same  attitude  change  strategy 
studies  with  different  intervention  characteristics,  methodological 
inadequacies,  and  instrumentation.  The  implication  is  that  the  reviewer  is 
responsible  for  attempting  to  explain  divergent  findings  from  a  set  of 
primary  studies. 

For these studies, the author did not indicate whether the findings were positive or negative.

In few of the 15 reviews was there evidence of systematic analysis of
factors that might account for contradictory findings among primary studies
or consideration of such findings as an opportunity to advance knowledge
about attitude modification. In three reviews (Alexander & Strain, 1978;
Horne, 1979; Segal, 1978) only successful studies were cited, so
discrepancies could not be examined. Of the remaining nine reviews, however,
only three provided explanations for differing findings.

Towner's  (1984)  examination  of  non-significant  and  negative  findings  was 
the  most  adequate  attempt  to  explain  discrepant  findings.  She  discussed  the 
covariation  of  outcomes  with  methodological  soundness  and  theoretical 
underpinnings and concluded that unsuccessful results could not be attributed
"to any single factor" (p. 252). She suggested that poor reports of studies,
weak methodologies, and failure to ground interventions in theory inhibited
the drawing of conclusions about the effectiveness of particular attitude
change strategies.

Although  they  reviewed  fewer  studies  than  Towner  (1984)  and  their 
analyses  of  contradictory  findings  were  less  thorough,  Anthony  (1972)  and 
Donaldson  (1980)  were  less  cautious  in  their  explanations  of  these  findings. 
Anthony's discussion pertained only to differences among studies
investigating contact as a change strategy, since studies cited for the
knowledge change technique and the contact plus knowledge technique were
considered either all successful or all unsuccessful. He noted that in
correlational studies of the relationship between reported contact and
attitude toward the disabled, there was a tendency to find positive
associations, whereas in experimental-type studies in which contact was the
independent variable, the findings were frequently nonsignificant or
negative. Anthony suggested that this might have been due to the effects of
self-reports  of  contact.  Such  estimations,  he  argued,  are  highly  subjective, 
with  the  definition  of  "contact"  usually  left  up  to  the  individual  subject. 
He  questioned  the  validity  of  contact  as  an  independent  variable  in  these 
studies,  in  contrast  to  studies  in  which  the  amount  or  type  of  contact  was 
controlled  by  the  researcher. 

Donaldson  (1980)  did  discuss  discrepant  findings.  In  regard  to  contact 
as  a  change  treatment,  she  observed  that  studies  tended  to  yield  successful 
results  when  contact  was  "structured"  to  expose  subjects  to  disabled  persons 
who  behaved  in  a  non-stereotypic  manner.  However,  no  explanation  was  given 
for the positive results from some studies in which "unstructured" contact
was  used. 

Donaldson  (1980)  also  addressed  the  inconsistencies  in  findings  among 
studies  in  which  information  was  used  as  the  change  agent.  Using  as  her  frame 
of  reference  a  Lewinian  model,  in  which  discomfort  reduction  is  posited  as 
the  mechanism  for  explaining  modified  attitudes,  she  noted  that  all  of  the 
successful  studies  attempted  to  "put  subjects  at  greater  ease  through  verbal 
messages or subtly sanctioned staring" (p. 510). Reduction of discomfort,
she maintained, was not a feature of the unsuccessful studies. The
unsuccessful studies cited were Cole (1971), Forader (1970), Granofsky
(1956), Wallston, Blanton, Robinson, and Pollchinck (1972), and Wyrick
(1968). It is difficult to determine whether discomfort reduction was or was
not a feature of several of these studies. The descriptions of Cole and
Granofsky's studies in Dissertation Abstracts, which Donaldson cited, are too
brief for that purpose. The reference to Wyrick's thesis was based on a
single  sentence  in  Evans  (1976)  which  stated,  "Wyrick  (1968)  assessed  the 
effects of a course in rehabilitation psychology on the attitudes of
nondisabled undergraduates and found no significant change" (p. 573).
Wallston et al. (1972) was not available to us.

In addition, Donaldson (1980) attempted to account for the differences
in findings between the two simulation studies she reviewed. She noted that
the successful intervention permitted subjects in wheelchairs to observe the
reactions of nondisabled persons to persons (themselves) who were apparently
disabled. During the unsuccessful simulation, the nondisabled persons were
obviously simulating disabilities (e.g., wearing a blindfold). Donaldson
contended that it was the difference in the subjects' opportunity to observe
the reactions of nondisabled persons to them as handicapped rather than as
role players that resulted in the discrepant findings, although the reason
for the effect was not disclosed.

Characterizing results. Whether there are discrepancies in findings or
not, the reviewer must characterize and summarize results as a basis for
drawing conclusions about the effectiveness of particular types of
interventions. Towner (1984) clearly used statistical significance. The
methods of analysis used for that purpose were not described in any of the
other reviews. It seems reasonable to infer from the contexts of the
reviews, however, that Anthony (1972), Haddle (1974), Segal (1978), and,
perhaps, Rabkin (1972) employed the box-score or voting method, in which
statistically significant and nonsignificant results are summed, to cumulate
findings.
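
The box-score or voting method can be sketched as a simple tally. The study records below are placeholders invented for illustration, not results from the reviews or primary studies discussed here, and the function and field names are hypothetical. The sketch also computes the quantity that the tally discards, the size of each effect.

```python
from collections import Counter

def box_score(studies):
    """Tally studies by outcome category, ignoring the size of each effect."""
    tally = Counter()
    for study in studies:
        if not study["significant"]:
            tally["nonsignificant"] += 1
        elif study["effect_size"] > 0:
            tally["significant positive"] += 1
        else:
            tally["significant negative"] += 1
    return tally

# Hypothetical study records (illustrative values only).
studies = [
    {"label": "Study A", "effect_size": 0.60, "significant": True},
    {"label": "Study B", "effect_size": 0.35, "significant": False},
    {"label": "Study C", "effect_size": 0.40, "significant": False},
    {"label": "Study D", "effect_size": -0.50, "significant": True},
]

tally = box_score(studies)
mean_effect = sum(s["effect_size"] for s in studies) / len(studies)

print(dict(tally))
print(f"mean effect size = {mean_effect:.2f}")
```

With these invented records the vote is split, with one significant positive result, one significant negative result, and two nonsignificant results, while the mean effect size is modestly positive. The two summaries can therefore point toward different conclusions about the same set of studies.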

Comments. A number of deficiencies in the analysis of data seriously
threaten the quality of most of the reviews. Information pertaining to
relationships among sample and intervention characteristics was frequently
lost because of the general tendency to consider the findings of complex
studies in simple treatment-outcome terms. Furthermore, with several
exceptions, there was a lack of attention to internal validity, and poorly
and well-designed studies were accorded equal status in the discussion of
interventions. Poorly designed and inadequately reported assessment of
attitudes also escaped attention in most reviews. In fact, in several
reviews, no mention was made of the means used in primary studies to assess
attitudes. Few reviewers attempted to explain contradictory findings among
primary studies, and in no review was this done in a systematic manner. It
appeared  that  the  box-score  or  voting  approach  was  used  to  arrive  at  an 
overall  judgment  of  the  efficacy  of  strategies  in  several  of  the  reviews. 

Integrating  and  Reporting  the  Findings 

If, as Jackson (1980) suggested, the "methodology of primary research"
can be "used to conceptualize the methodology of integrated reviews" (p.
442), then the final section of a review is analogous to the "Discussion and
Conclusions" section in a report of a primary study. It is expected that
here the results of the study will be discussed in terms of the original
hypotheses or questions, and practical and theoretical implications drawn
from these results. Here, also, the limitations of the study are to be
identified  and  recommendations  for  further  research  made. 

The  headings  for  final  sections  in  11  of  the  15  reviews  clearly 
identified them as summaries or conclusions. In fact, the term
"conclusion" was included in the headings of final sections in Anthony
(1972), Horne (1979), Johannsen (1969), and Segal (1978), while "summary" was
used in Haddle (1974), Harth (1973), Johannsen (1969), Rabkin (1972), and
Towner (1984). The word "implications" was used in the headings of
concluding sections in Anthony (1972), Donaldson (1980), and Westwood et al.
(1981). One brief review (Alexander & Strain, 1978) contained no final
conclusions section; one (Pulton, 1976) did not summarize the research in the
final section, but proposed an attitude change strategy; and Sandler and
Robinson  (1981)  made  recommendations  that  were  not  based  on  their  own  review 
of  primary  studies.  Conclusions  as  to  the  effectiveness  of  change  strategies 
were  drawn  in  most  of  the  15  reviews,  even  though  often  briefly  stated.  They 
are  summarized  in  Table  4. 

The  practical  implications  of  the  conclusions  drawn  by  the  reviewers  are 
open  to  question.  Most  reviewers  who  identified  effective  strategies  for 
modifying  attitudes  toward  the  disabled  did  so  only  in  general  terms. 
Typical of these reviewers were Anthony (1972), Harth (1973), Horne (1979),
and  Segal  (1978).  Anthony,  for  example,  concluded  that  "attitudes  of 
nondisabled  .  .  .  can  be  influenced  positively  by  providing  ...  an 
experience  which  includes  contact  with  disabled  persons  and  information  about 
the  disability"  (p.  123).  In  a  similar  vein,  Harth  proposed  that  "rather 
direct,  well  organized  procedures  are  required"  for  "bringing  about 
significant  positive  changes  in  attitudes"  (1973,   p.  161). 

The  exceptions  were  Donaldson  (1980)  and  Sandler  and  Robinson  (1981)  who 
offered  specific  suggestions  that  would  be  useful,  though  not  necessarily 
sufficient,  to  readers  seeking  information  on  which  to  base  attitude  change 
programs.  The  remaining  two  reviewers  (Haddle,  1974;  Towner,  1984)  indicated 
that  the  evidence  was  too  inconclusive  to  support  the  recommendation  of  any 
intervention  as  most  effective  for  enhancing  attitudes  toward  disabled 
persons. 

In the reviews in which conclusions about the effectiveness of change
strategies were not included in a concluding section, inferences were stated
following discussions of specific interventions (Rabkin, 1972; Westwood et
al., 1981), included in introductions to such discussions (Johannsen, 1969;
Pulton, 1976), or sometimes embedded within the discussions of specific
interventions (Horne, 1985). Suggestions for attitude change programs based
on these findings were also stated as broad generalizations.

The practical significance of the recommendations in the reviews is
uncertain, not only because of the general manner in which the
recommendations were stated but also because the validity of the reviewers'
data collection methods can be seriously questioned. In most of the reviews,
such problems as the lack of representativeness of samples of primary
studies, the grouping of primary studies into loosely defined categories, and
the tendency to ignore complex designs threatened the usefulness of the
recommendations.

Theoretical  Implications 

Attitude change theory might have been advanced had the theoretical
bases for or implications of the interventions been examined, but few
reviewers did so. In fact, research results were discussed in terms of
attitude theories in the conclusions sections of only two reviews (Pulton,
1976; Towner, 1984).

Pulton (1976) utilized role conflict theory to explain attitude change
following simulations of marginal disabilities in one study (Clore & Jeffrey,
1972). He suggested that nondisabled subjects who role play being a disabled
person experience both inter- and intra-role conflict. Inter-role conflict
occurs because nondisabled subjects who simulate disabling conditions must
behave as if disabled while continuing to be nondisabled persons. Intra-role
conflict occurs when significant others demand "immediate explanations" for
the apparent disability or when nondisabled persons expect subjects to be
competent in managing their disabilities. Pulton argued that subjects who
role play marginal disabilities experience more psychological strain than
subjects who role play severe disabilities since marginally disabled persons
have more behavioral choices available to them, and hence face more
conflicts. These conflicts, he maintained, result in greater acceptance of
disabled persons. Consequently, Pulton advocated "emotional role playing" as
a means for enhancing attitudes toward the disabled. Pulton did not discuss
the theoretical underpinnings of the studies he reviewed which employed
contact and knowledge as change strategies.

Towner's (1984) discussion of attitude theory was thorough and
systematic. She extrapolated elements from attitude theory to use in
reviewing each of 47 studies, and she concluded that most successful
approaches were applications of the attitude change theory proposed by
Hovland, Janis, and Kelley (1953). This theory has been labeled in the
literature (Kiesler, Collins, & Miller, 1969; Insko, 1967) as "stimulus
response" or "behavioristic" theory. Because Towner found so few
methodologically sound studies among the 47 she examined, she argued that her
finding that success appeared to be related to the number of theoretical
elements employed in a study should be considered tentative.

One other reviewer (Donaldson, 1980) cited specific attitude theories to
explain the efficacy of particular interventions as change strategies.
Although Donaldson made no mention of theory in her conclusions section, she
included a section in her review entitled "Theoretical Models". Here,
Donaldson suggested that together Lewin's theory of attitude change and the
theory described in Communication and Persuasion (Hovland et al., 1953) best
explained attitude change in six of the studies she examined. She failed to
explain why subjects in five studies that apparently did not fit this model
also experienced enhanced attitudes toward the disabled. And, as noted
earlier (p. [46]), it was difficult to determine that elements of these
theories were absent from the unsuccessful studies she cited.

Qualifying  Conclusions 

Of the 15 reviewers, only Anthony (1972), Chubon (1982), and Towner
(1984) qualified their conclusions by reference to the primary studies they
reviewed. Both Chubon (1982) and Towner (1984) cited weaknesses in the
design and instrumentation of primary studies as limiting factors in the
generalizability of their findings. Due to the general lack of
methodological soundness in the studies reviewed, Chubon referred to his
findings as "soft or preliminary" (1982, p. 29) and Towner characterized her
findings as "contaminated" (1984, p. 251). Anthony (1972) suggested that his
conclusions pertaining to contact and information were limited because of the
restrictive nature of the samples used in the primary research studies. He
noted that "college students who volunteered" and "trainees in the helping
professions" (p. 123) comprised most samples. His conclusion that contact
plus information was an effective attitude change strategy could not be
generalized beyond these groups.

Research  Recommendations 

Recommendations for future research were included in the results of
seven of the 15 reviews. These recommendations varied from simple pleas for
"successful experiments" (Pulton, 1976, p. 87) and for researchers to
"coalesce the disparate findings [of primary studies] and to build upon the
work of one another" (Chubon, 1982, p. 29), to multiple suggestions for
designing studies and selecting and reporting the instruments used to assess
attitudes (Towner, 1984).

The most frequently mentioned recommendation was to base research on
theory and to design research to test competing theories of attitude change
(Harth, 1973; Donaldson, 1980; Towner, 1984). Furthermore, Donaldson and
Towner suggested that future research should examine the relationship between
attitude and behavior, as well as address the question of the long-term
effects of attitude change interventions. Donaldson also proposed that
future studies should be directed toward investigating social forces that
encourage the devaluation of disabled persons.

Research designed to assess the effectiveness of a variety of media for
presenting persuasive communications was advocated by Donaldson (1980) and
Johannsen (1969). Donaldson also recommended research to explore the
differential effects of live versus media presentations of nondisabled
persons interacting with disabled persons in a nonstereotypic and positive
manner in studies of desensitization and modeling as attitude change
approaches. Additionally, she called for research to identify and analyze
the factors that contribute to positive attitudes in university courses,
integrated settings, and disability simulations.

Alexander and Strain's (1978) recommendations for future research
reflected their interest in modifying teachers' attitudes toward disabled
students and mainstreaming. They favored studies designed to identify
factors in preservice and inservice teacher education programs that would
have positive effects on attitudes toward both disabled children and their
integration into regular classes.

Horne (1985) reiterated Donaldson's (1980) and Towner's (1984) suggestion
that future researchers should investigate the long-term effects of
attitude change strategies, and she agreed with their recommendation that
research should be directed toward exploring the relationship between scores
on attitude measures and behavioral changes. Additionally, Horne argued that
the relationship between subject characteristics (such as predisposition
toward disabled persons, age, and sex) and the effectiveness of particular
interventions should be examined, along with strategies for modifying
attitudes in specific situations (e.g., the attitudes of a particular group
toward a certain disability).

The  focus  of  the  recommendations  for  future  research  was  on  primary 
studies.    No  reviewer  proposed  additional  or  alternative  types  of  reviews. 

Comments 

Concluding sections in most reviews did not hold up well when examined
against criteria for preparing conclusions in reports of primary studies.
Fewer than half of the reviewers referred specifically to their own findings
in drawing conclusions. Most conclusions concerning the effectiveness of
particular intervention strategies were stated in broad generalizations that
were of doubtful utility. Few reviewers acknowledged the limitations placed
on their findings by poorly designed primary studies, and recommendations for
future research were found in only half of the reviews. Similarly, the
reviews made little contribution to an increased understanding of attitude
change theory.

Discussion 

Seven full and eight brief reviews of primary research on the
modification of attitudes toward disabled persons were located by a
comprehensive search of the literature. These reviews were examined for
methodological soundness and for their contribution to practical knowledge
and attitude change theory using questions developed from the work of Jackson
(1978, 1980) and others, with the primary research process as a model.

Although building on prior works is a standard approach for advancing
knowledge in a field, most reviewers ignored previous, but relevant, reviews.
As a consequence, the reviewers did not draw on the findings of earlier
reviews; neither did they use inadequacies in prior reviews as a means for
improving the quality of their work. Most of the reviews were presented as
though they were unique in the literature.

The possibility of sampling bias was present in each review. Methods of
locating primary studies were seldom reported; moreover, the limited
reference lists of studies and the small number of primary studies that were
cited in more than one review cast serious doubt on the representativeness of
the samples. Consequently, the generalizability of the findings of the
reviews was dubious.

Many of the primary studies reviewed were low in treatment and internal
validity; and, although this was mentioned in several reviews, it could not
be determined how or if such studies were weighted when decisions concerning
the effectiveness of particular interventions were reached. It seems
apparent, given that lack of discussion, that treatment and internal validity
were not explicitly considered in most reviews. Including poorly designed
and executed studies in the reviews without examining the association between
design quality and outcomes compromised the integrity of interpretations and
conclusions.

A number of significant methodological weaknesses were found in most of
the reviews. Primary studies were placed into loosely defined intervention
categories, with the result that important differences in sample and
intervention characteristics were frequently disregarded. Narrative reports
of programs and reviews of literature were cited as though they were primary
studies. In several reviews, primary studies were misinterpreted and
irrelevant studies were cited. Furthermore, there was a general tendency to
report the findings of complex primary studies in simple treatment-outcome
terms and, in some cases, to report only partial results. Moreover, even the
statistical significance of findings was not presented in most reviews, and
none reported an effect size metric independent of sample size. In addition,
studies which failed either to identify the dependent variable or to provide
reliability or validity data for project-developed instruments appeared to be
accepted uncritically. Contradictory findings were not adequately analyzed
or explained in most reviews. In view of these concerns, the findings of the
reviews should be treated with caution.

Important deficiencies were also noted in the conclusions of most
reviews. The practical value of the conclusions is questionable, as
interventions judged to be effective were most often described in broad
generalizations. Few reviewers attempted to examine the theoretical
underpinnings of change strategies, and little contribution to attitude theory
was made by the reviews. Additionally, few reviewers acknowledged
limitations to the generalizability of their findings, even though sample
bias was a threat in all reviews. Specific recommendations for future
research were rare.

Implications for the Present Study
This review of reviews has important implications for the quantitative,
integrative review reported on the following pages. Many of the problems we
identified appear to be common to past narrative integrative reviews of the
literature (Jackson, 1980). Particularly germane to the present study are
the potential sampling bias and other methodological problems common to most
of the 15 reviews examined above.

Sampling bias was a threat to the generalizability of the findings of
each review. The small number of primary studies included in the reviews,
the slight overlap of samples of primary studies, and the failure of
reviewers to describe how samples were selected suggest the possibility that
the samples of studies for the reviews were not representative of the
population of primary works reporting investigations of interventions for
modifying attitudes toward the disabled. This finding indicated that a
careful sampling approach or a comprehensive literature search should be
considered as an integral part of additional reviews and that future
reviewers should report literature search procedures thoroughly.

The ineffectiveness of the reviewers' practice of grouping interventions
under broad descriptors in a manner that failed to take into account specific
sample and intervention characteristics indicates the need for more precise
coding and reporting of design and sample characteristics. Donaldson's
(1980) recommendation that future research be directed toward identifying
specific factors in university courses, integrated settings, and disability
simulations that contribute to enhanced attitudes toward the disabled is
consistent with this recommendation.

Suggestions for sub-categories within generic groupings of interventions
were found in several of the reviews. Towner (1984), for example, organized
and analyzed primary studies according to general sample characteristics, and
her description of attitude change techniques had implications for coding
interventions in a more accurate manner. The conclusion in several reviews
that contact was an effective change strategy only if certain conditions were
met (Donaldson, 1980; Harth, 1973; Segal, 1978; Westwood et al., 1981), and
that the efficacy of information as a means for modifying attitudes was
related to the way in which it was presented (Sandler & Robinson, 1981), also
implied subcategories for obtaining detailed descriptions of interventions.
Finally, Anthony's (1972) conclusion that length of treatment in an attitude
change study may have a bearing upon the results drew attention to the need
to include this factor in describing interventions.

The failure of most reviewers to explore relationships between quality
of research design and results in the primary studies they examined implied
that data for our review should be collected and organized so that
associations between design quality and outcomes could be examined.
Similarly, the tendency in several reviews to ignore poorly designed and
inadequately reported attitude assessments in primary studies suggested that
comparing results of studies differing in quality of instrumentation would be
an appropriate strategy in the present study. Examining quality of design
and instrumentation might provide insights into the reasons for contradictory
findings. In addition, the lack of attention in most reviews to the theoretical
underpinnings of specific change strategies indicated that study-theory
relationships should be addressed in the present study. Most importantly,
the absence of a comprehensive, systematic, integrative review of the
research suggested the need for such an effort to determine whether the
generally indefinite conclusions about the effectiveness of types of
interventions for modifying attitudes accurately reflect the state of
available research knowledge.

CHAPTER 3

HOW  THE  REVIEW  WAS  CONDUCTED 

The review of literature has taken on greater status as a full-fledged
scholarly research activity in the years since Glass (1976) first advocated
the conducting of meta-analyses. Earlier, Feldman (1971) had expressed
concern about the "half-hearted commitment" of behavioral scientists to
reviewing and integrating the research literature. He suggested that the lack
of more intense attention to conducting reviews "might account in part for
the relatively unimpressive degree of cumulative knowledge in many fields"
(p. 86). In his landmark presentation, Glass (1976) pointed out that
scholarly values and attitudes which emphasized original research militated
against the commitment of effort that is necessary to extract knowledge from
the "staggering number of individual studies" now available.

Careful analyses of the literature, not superficial synopses, are
needed. And, in fact, Glass argued, "a good review is the intellectual
equivalent of original research . . . ; we need more scholarly effort
concentrated on the problem of finding the knowledge that lies untapped in
completed research studies" (p. 4). However, as Jackson (1980) and Light and
Pillemer (1984, pp. 3-4) have noted, the traditional narrative review of
research has tended to be subjective, lacking in scientifically sound
procedures, and inefficient, especially when findings are to be extracted and
summarized from a large number of studies, for example, 30 or more. If
reviews of literature are to meet scholarly criteria such as are applied to
original research, it is crucial that a method of reviewing be used that
overcomes the shortcomings of the "traditional literary format," which often
results in reviews that tell little more than the directions of findings and,
sometimes, their probability level (Rosenthal, 1984, p. 10).

Alternative  Approaches  to  Integrating  Primary  Research 
The most commonly used approach to integrating research has been the
narrative review, such as those critiqued in Chapter 2. Typically, such a
review is based on a small group of easily obtained research reports,
gathered from fairly prominent journals and Dissertation Abstracts, using
criteria that are not clearly specified. The reviewer usually offers a brief
verbal synopsis of each report; sometimes the methodology is critiqued and
the credibility of any conclusions challenged. Often the reviewer concludes
that the existing research is inconclusive and calls for additional research,
using better techniques and more precise methodology.

In a variation of the narrative review approach, the reviewer begins
with a small group of readily available articles, but eliminates all that
have striking design or analysis flaws. The findings from the remaining
"acceptable" studies are summarized as the knowledge on the topic under
review. Unfortunately, judgments as to what constitutes good research
frequently differ from reviewer to reviewer. Also, the criteria for
selecting "methodologically superior" articles are often overly restrictive,
with the result that a small and frequently unrepresentative sample of
articles is considered. Moreover, as Smith and Glass (1980) have pointed
out, even "methodologically good" studies often result in contradictory
findings, creating considerable difficulty in deciding what conclusions
should be reached.

A more systematic approach to integrating the outcomes of primary
research is what Light and Smith (1971) and others (e.g., Hedges & Olkin,
1980) have referred to as the "vote-count method". With this method, each
relationship between a treatment and a dependent variable is tallied as
positively statistically significant, negatively statistically significant,
or not statistically significant. The size of the sample utilized in the
particular primary research study is rarely taken into account. Since with
larger sample sizes there is a greater probability of concluding that results
are statistically significant, the voting method discriminates systematically
against studies with small samples. In addition, use of the voting method
implies incorrectly that statistical significance indicates the degree of
importance of relationships. Consequently, erroneous or misleading
conclusions may be drawn. As Glass (1977, p. 358) pointed out, if nine
small-sample studies of a method of modifying attitudes yielded
not-quite-significant results in one direction, while a tenth large-sample
study yielded statistically significant results in the same direction, the
vote would be one for and nine against, a conclusion quite at odds with what
seems sensible.
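
The vote-count problem can be illustrated with a minimal sketch. The ten studies below are hypothetical (they are not taken from the reports reviewed here), the function names are illustrative, and the p-values rest on a large-sample normal approximation for two equal-sized groups; the point is only that identical effects produce a lopsided vote when sample sizes differ.

```python
from statistics import NormalDist

def two_sided_p(d, n_per_group):
    """Approximate two-sided p-value for a standardized mean difference d
    from two equal groups (normal approximation, an assumption of this sketch)."""
    se = (2.0 / n_per_group) ** 0.5   # approximate standard error of d
    z = d / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

def vote_count(studies, alpha=0.05):
    """Tally each (d, n_per_group) study as positive, negative, or not significant."""
    tally = {"positive": 0, "negative": 0, "not significant": 0}
    for d, n in studies:
        if two_sided_p(d, n) >= alpha:
            tally["not significant"] += 1
        elif d > 0:
            tally["positive"] += 1
        else:
            tally["negative"] += 1
    return tally

# Hypothetical data: nine small studies and one large study, all with d = 0.40.
studies = [(0.40, 40)] * 9 + [(0.40, 400)]
tally = vote_count(studies)
mean_d = sum(d for d, _ in studies) / len(studies)
print(tally, round(mean_d, 2))
```

Under these assumptions the vote is one "for" and nine "not significant", even though every study shows the same effect and the mean effect size is 0.40.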

In considering how to improve on the voting method, Light and Smith
(1971) concluded that ". . . progress will only come when we are able to
pool, in a systematic manner, the original data from the studies" (p. 243).
Unfortunately, users of this procedure must disregard any studies for which
research data are not obtainable, and original data from studies are rarely
easy to obtain. For example, Glass (1977) reported that Wolins (1962) wrote
to 37 authors asking for the data from studies they had published in the
preceding two years: Five did not reply, 21 reported that their data were
irretrievable, two refused to share the results, and four sent their data too
late to be useful. Our efforts to obtain only the information necessary to
compute effect sizes (reported later in this chapter) verified the likely
difficulties in obtaining data for secondary analyses.

The approach to integrating the results of prior research to be used in
this project was first proposed by Glass (1976) as "meta-analysis". Properly
implemented, the meta-analysis approach meets all of the criteria for
high-quality integrative reviews proposed by Jackson (1980) and referred to in
Chapter 2. Briefly, in conducting a meta-analysis, the reviewer: (1)
locates either all studies or a representative sample of all studies on the
defined topic; (2) converts the findings of each study, regardless of study
quality, to a common metric, that is, computes an effect size for each
relevant finding; (3) codes the various characteristics of each study that
might have affected the results (such as type of treatment, methodological
quality, sample attributes, and type of dependent measure); (4) uses
statistics to summarize study outcomes (effect sizes) and to examine the
covariations of outcomes and study characteristics; and (5) draws
conclusions based on the results of those analyses.
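
Steps (2) through (4) can be sketched in a few lines. The sketch assumes one common effect size definition, the treatment-control mean difference divided by the control group standard deviation; the formula actually used in this review is described elsewhere in the report. The records, field names, and function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def effect_size(mean_treatment, mean_control, sd_control):
    """Standardized mean difference using the control-group SD (assumed definition)."""
    return (mean_treatment - mean_control) / sd_control

# Hypothetical coded findings: each record holds summary statistics plus one
# coded study characteristic (type of treatment).
findings = [
    {"treatment": "contact", "m_t": 52.0, "m_c": 48.0, "sd_c": 10.0},
    {"treatment": "contact", "m_t": 30.0, "m_c": 27.0, "sd_c": 6.0},
    {"treatment": "information", "m_t": 41.0, "m_c": 40.0, "sd_c": 5.0},
]

def mean_effect_by(records, key):
    """Average effect size within each level of a coded characteristic."""
    groups = defaultdict(list)
    for r in records:
        groups[r[key]].append(effect_size(r["m_t"], r["m_c"], r["sd_c"]))
    return {level: mean(values) for level, values in groups.items()}

summary = mean_effect_by(findings, "treatment")
print(summary)
```

Grouping by a coded characteristic is the simplest way to examine how outcomes covary with study features; the same function could be called with any other coded field.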

Clearly, meta-analysis is not a technique, but an approach to reviewing
the literature (one might say, a point of view about reviews) which
emphasizes the gathering and analysis of comprehensive, systematic,
quantitative data from primary research reports. In his critique of previous
efforts to integrate the findings of social science research, Jackson (1980)
concluded that "the meta-analytic approach is an important contribution to
social science methodology. It is not a panacea, but it will often prove to
be quite valuable when applied and interpreted with care" (p. 455). And Gage
(1982), in discussing the past and future of educational research, referred
to meta-analysis as one of the more important methodological advances in
recent years.

Some educational and psychological researchers have raised questions
about the use of the meta-analysis approach (e.g., Eysenck, 1978; Gallo,
1978; Mansfield & Bussey, 1977; Shaver, 1979a, b; Simpson, 1980; Slavin,
1984). Some have questioned the results of specific meta-analyses; others
have raised concerns about the meta-analysis approach per se. Most of these
criticisms and cautions have been responded to in the literature (e.g.,
Glass, 1978, 1980; Glass & Smith, 1978; Gar.h^irg et al., 1984), and will not
be discussed here.

A major dilemma is how to capitalize on the advantages of the
meta-analytic approach with large numbers of primary research reports so as to
avoid the pitfalls of traditional narrative reviews, but without becoming
"over-quantified" and losing touch with the subtle variations in individual
studies (Light & Pillemer, 1984; Slavin, 1986; Wolf, 1986). As Jackson
(1978) has pointed out, "there are usually trade-offs between the quantity of
data and the quality of the data or analyses" (p. 16). How to consider
individual studies or discuss critical issues in reference to specific
studies, as has often been done in traditional narrative reviews, poses
serious dilemmas for quantitative reviewers faced with large numbers of
findings to integrate. Slavin (1986) has proposed, for example, that the
reporting of individual studies should be as specific and detailed as
possible, as one type of effort to preserve a positive aspect of narrative
reviews.

The most important point that the concerns and questions about the
meta-analytic approach have demonstrated is that meta-analysis, like specific
research procedures such as random assignment, is not a fail-safe approach.
If applied carelessly, many problems will occur. However, as Rosenthal
(1984) noted:

The alternative to the systematic, explicit, quantitative procedures [of
meta-analysis] is even less perfect, even more likely to be applied
inappropriately, and even more likely to lead us to error. There is
nothing in the set of meta-analytic procedures that makes us less able
to engage in creative thought. All the thoughtful and intuitive
procedures of the traditional review of the literature can also be
employed in a meta-analytic review. However, meta-analytic reviews go
beyond the traditional reviews in the degree to which they are more
systematic, more explicit, more exhaustive, and more quantitative.
Because of these features, meta-analytic reviews are more likely to lead
to summary statements of greater thoroughness, greater precision, and
greater intersubjectivity or objectivity. (p. 17)

It is important to note that the term "meta-analysis", first proposed by
Glass (1976) to refer to "the statistical analysis of a large collection of
analysis results from individual studies for the purpose of integrating the
findings" (p. 3), is not a unitary concept, nor is it without ambiguity in
use. Other researchers (e.g., Hedges & Olkin, 1985; Hunter, Schmidt, &
Jackson, 1982; Rosenthal, 1984) were involved in the parallel development of
quantitative review methods that are forms of meta-analysis as defined
immediately above, but that do not necessarily fit exactly the meta-analytic
steps defined earlier (p. 66). Bangert-Drowns (1986) has presented an
excellent discussion of five forms of meta-analysis, including Glass's
approach, other approaches developed parallel in time, and elaborations of
Glass's approach.

One variation of Glass's approach utilizes the study rather than the
study finding as the unit of analysis. Exploring variables that covary with
outcomes is still a central concern, but poor quality studies or studies in
which the treatments do not meet clearly defined criteria may be excluded
from the sample. Bangert-Drowns (1986) refers to this approach as "study
effect meta-analysis" (p. 393), as contrasted with "Glassian meta-analysis"
(p. 391).

In another approach, the "combined probability method" (Bangert-Drowns,
1986, p. 394), the analyst also uses the study as the unit of analysis. The
focus, however, is on producing an average effect size and a combined
probability statement for all of the studies reviewed; less attention is paid
to exploring outcome variation.
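
One well-known way to produce a combined probability statement is the unweighted Stouffer procedure, which converts each study's one-sided p-value to a standard normal deviate, sums the deviates, and divides by the square root of the number of studies. The sketch below assumes that procedure for illustration; it is not claimed to be the exact computation used by any author cited here, and the p-values are hypothetical.

```python
from statistics import NormalDist

def stouffer_combined(one_sided_ps):
    """Combine one-sided p-values (each strictly between 0 and 1) with the
    unweighted Stouffer method. Returns (combined z, combined one-sided p)."""
    nd = NormalDist()
    zs = [nd.inv_cdf(1.0 - p) for p in one_sided_ps]   # p-value -> normal deviate
    z_combined = sum(zs) / len(zs) ** 0.5
    return z_combined, 1.0 - nd.cdf(z_combined)

# Hypothetical data: four studies, none significant alone at the .05 level.
z, p = stouffer_combined([0.10, 0.10, 0.10, 0.10])
print(round(z, 2), round(p, 4))
```

In this hypothetical case, four individually nonsignificant results in the same direction yield a combined probability below .01, which is the kind of summary statement the method is designed to provide.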

A stronger inferential statistics orientation is represented by the
"approximate data pooling with tests of homogeneity" approach. Here, the
intent is "to approximate the pooling of the subjects from all of the studies
into one large comparison" (Bangert-Drowns, 1986, p. 394). As part of that
process, tests for the homogeneity of effect sizes and adjustments of effect
sizes are employed.
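
A homogeneity test of this kind can be sketched with the usual inverse-variance weighted Q statistic, which is referred to a chi-square distribution with k - 1 degrees of freedom for k effect sizes. The sketch assumes that each effect size comes with an estimate of its sampling variance; the data and function name are hypothetical.

```python
def homogeneity_q(effects, variances):
    """Return (weighted mean effect, Q statistic, degrees of freedom).

    Each effect is weighted by the inverse of its sampling variance; Q is the
    weighted sum of squared deviations from the weighted mean.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    q = sum(w * (d - pooled) ** 2 for w, d in zip(weights, effects))
    return pooled, q, len(effects) - 1

# Hypothetical data: three effect sizes with equal sampling variances.
pooled, q, df = homogeneity_q([0.2, 0.4, 0.6], [0.04, 0.04, 0.04])
print(round(pooled, 2), round(q, 2), df)
```

Here Q is 2.0 on 2 degrees of freedom, below the .05 chi-square critical value of 5.99, so these hypothetical effects would be treated as homogeneous.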

Finally, a variation of the last approach, also aimed at determining a
general estimate of treatment effect, was labeled by Bangert-Drowns (1986) as
"approximate data pooling with sampling error correction" (p. 395). With
this study-effect-size method, the variance in effect sizes due to sampling
error variability is estimated and subtracted from the total variation to
determine if the remaining variance is large enough to justify the
investigation of moderator variables.
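
The subtraction described above can be sketched as follows. This simplified version is an assumption made for illustration: it uses the unweighted variance of the observed effect sizes and the simple mean of the per-study sampling variances, whereas published procedures typically weight by sample size. The data and function name are hypothetical.

```python
from statistics import mean, pvariance

def residual_variance(effects, sampling_variances):
    """Return (observed variance, expected sampling-error variance, residual).

    The residual is floored at zero, since a negative variance estimate is
    read as no variation beyond sampling error.
    """
    observed = pvariance(effects)
    expected = mean(sampling_variances)
    return observed, expected, max(observed - expected, 0.0)

# Hypothetical data: widely scattered effects versus tightly clustered effects.
scattered = residual_variance([0.1, 0.5, 0.9], [0.04, 0.04, 0.04])
clustered = residual_variance([0.4, 0.5, 0.6], [0.04, 0.04, 0.04])
print(scattered, clustered)
```

In the scattered case some variance remains after the correction, which would justify a search for moderator variables; in the clustered case sampling error alone accounts for the observed variation.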

A sixth variation of meta-analysis, not mentioned by Bangert-Drowns, is
Slavin's (1986) "best evidence synthesis", a combination of elements of
Glassian meta-analysis and traditional narrative reviews. With this
approach (which is, like Glass's, aimed at making statements about treatment
effects, not at estimating a pooled sample result), the analysis of study
effect sizes from the available reports that are highest in internal and
external validity is to be coupled with tables that describe study
characteristics.

As Bangert-Drowns (1986) has noted, the choice of a quantitative
approach for conducting an integrative review should be based on the purpose
for the review. Is the intent to try to determine what the available
research has to say about the effectiveness of a treatment or treatments, or
is it to approximate a pooled sample in order to arrive at a "generalizable
estimate of a treatment effect"? Our purpose was clearly the former. For
that reason, we have adopted an approach, which we prefer to call
quantitative-integrative rather than meta-analytic, that draws primarily on
Glass's (1976, 1977) conception of meta-analysis and recognizes Slavin's
(1986) advice to provide the reader with information on primary studies to
the extent possible within the constraints of reviewing large bodies of
research.

Procedures  for  this  Review 

As others (e.g., Jackson, 1978, p. 7; Cooper, 1982) have noted, the
tasks involved in doing an integrative review of literature are basically the
same as for primary research. To lay the basis for the study, a topic must
be selected, a problem specified (as has been done in Chapter 1), and a
critical review of prior related work completed (Chapter 2). Then, the study
must be conducted: subjects (in a review, research reports) identified and
selected, instrumentation selected or developed, data gathered and analyzed,
and results and interpretations reported.

Interestingly, the one missing element in planning a review, as compared
to primary research, is the design of the study to control for various
threats to internal validity. Reviews of literature are, by definition, post
hoc efforts to draw conclusions from data already gathered under research
conditions beyond the control of the investigator (reviewer). Some (e.g.,
Shaver, 1979a, b) have argued that post hoc analyses (integrative reviews) of
sets of primary studies that often were carried out without planning based on
the careful analysis of prior primary research are not likely to be any more
productive than the post hoc analyses of primary data about which researchers
have been skeptical because of the difficulty, if not impossibility, of
controlling for extraneous factors. Whether that skepticism is valid for
quantitative integrative reviews is still open to question.

In any event, the description of procedures for this comprehensive
integrative review of the research on modifying attitudes toward disabled
persons follows the traditional format, sans Design section. With the
problem statement and review of literature presented in earlier chapters, the
rest of this chapter addresses the identification and selection of studies,
instrumentation, data collection, and analysis.

Accessible  Population 

The  purpose  of  this  study  was  to  conduct  a  comprehensive  integrative 
review  of  the  literature.  That  is,  the  target  population  was  all  English- 
language  reports  of  research  identifiable  through  an  extensive  search 
strategy  conducted  in  this  country  and  Canada.  The  intent  was  to  obtain  all 
such reports, to the extent possible. There was, therefore, no sampling
procedure; and, as will be reported later, only a few of the identified
reports could not be obtained, although some that were relevant had to be
discarded because adequate information was not reported.

Because of the inclusiveness of study identification, although certainly
not perfect, we consider the set of primary research reports we reviewed to
be an accessible population, not a sample. As Jackson (1980, p. 453) has
pointed out, the decision as to whether a set of studies is to be considered
a population or a sample is a serious one, with implications for whether
inferential statistics should be used to analyze the data gathered from the
reports (discussed in the Analysis section of this chapter). The matter is
complicated by the tendency of reviewers to want to draw conclusions about
populations of phenomena (e.g., attitude change treatments), not about a
population of research reports. That distinction is a fine line to draw and
walk, but we shall try to do so and be circumspect in this report.

It should also be recognized, as noted above, that including all
obtainable research reports in the review is not a universally accepted
strategy (Bangert-Drowns, 1986). Some critics of Glass's meta-analytic
approach (e.g., Eysenck, 1978) have argued that only methodologically
superior studies should be included in integrative reviews, on the ground
that poorly designed studies cannot yield valid and, therefore, useful
information. The counter-argument is that such a restriction may frequently
eliminate studies from which important information can be gained. As Glass
(1977) has noted, researchers do not typically set out to perform studies
that are deficient; and, once a less than perfect study has been done, its
findings should not be disregarded totally. He argued:

Many  weak  studies  can  add  up  to  a  strong  conclusion.  Suppose  that  in  a 
group  of  one  hundred  studies,  studies  1  to  10  are  weak  in  representative 
sampling  but  strong  in  other  respects,  studies  11  to  20  are  weak  in 
measurement  but  otherwise  strong,  studies  21  to  30  are  weak  in  internal 
validity  only,  studies  31  to  40  are  weak  only  in  data  analysis,  etc. 
But  imagine  also  that  all  100  studies  are  somewhat  similar  in  that  they 
show  a  superiority  of  the  experimental  over  the  control  group.  The 
critic  who  maintains  that  the  total  collection  of  studies  does  not 
support strongly that conclusion of treatment efficacy is forced to
invoke an explanation of multiple causality (i.e., the observed
difference can be caused either by this particular measurement flaw or
that particular design flaw or this particular analysis flaw or . . .).
The number of multiple causes which must be invoked to counter the
explanation of treatment efficacy can be embarrassingly large for even a
few dozen studies. Indeed, the multiple defects explanation will soon
grow into a conspiracy theory or else collapse under its own weight.
Respect for parsimony and good sense demands an acceptance of a notion
that imperfect studies can converge on a true conclusion. (p. 356)

Of course, it is possible that methodologically weak studies will yield
biased or misleading results. For example, in a meta-analysis of research on
the treatment of hyperactivity (White, Myette, Baer, & Taylor, 1982), an
empirical approach was taken to determining whether including "weak" studies
biased the results. Each study was classified according to well-defined
criteria for methodological quality (e.g., type of control group, reliability
or fakability of outcome measures, "blinding" of judges, duration of
intervention), and associations between design strength and outcomes were
analyzed. "Weaker" studies did yield different results than "stronger"
studies. That is, when all of the studies in the set were considered, drugs
appeared to be more effective in reducing the symptoms of hyperactivity than
when the analysis was limited to those studies in which control groups were
used, which met minimum standards of internal validity, and in which
objective measures were used to select hyperactive children for the study and
to measure outcomes. Consequently, more credence was placed in the results
of the "stronger" studies. However, had the results been similar regardless
of study quality, including all of the studies would have allowed the more
complete examination of data to answer other important questions (e.g., the
influence of age of child or duration of treatment).
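
A minimal Python sketch of this kind of empirical check follows: effect sizes are grouped by a coded quality rating, and the group means are compared. The quality labels, the function name, and the record layout are hypothetical, chosen only to illustrate the idea of examining whether outcomes covary with design strength.

```python
from collections import defaultdict
from statistics import mean


def mean_effect_by_quality(records):
    """records: iterable of (quality_label, effect_size) pairs.

    Returns a dict mapping each quality label to its mean effect size.
    """
    groups = defaultdict(list)
    for quality, effect in records:
        groups[quality].append(effect)
    return {quality: mean(values) for quality, values in groups.items()}
```

If the group means are close, analyzing the full set of studies is easier to defend; if they diverge, as in the hyperactivity example, more credence would go to the stronger studies.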

In  summary,  for  the  review  reported  here,  a  comprehensive  approach  was 
taken  to  identifying  and  securing  research  reports.  As  will  be  reported,  the 
covariation  of  results  with  methodological  adequacy  was  examined  and 
conclusions drawn accordingly. Use of this approach does not condone future
primary  research  studies  with  weak  designs,  poor  measurement  techniques,  or 
inappropriate  analyses.  In  fact,  the  results  reported  in  later  chapters 
provide strong evidence for the importance of careful study design.

Definition of scope. Having stated that the target population is all
English-language reports of research to modify attitudes toward disabled
persons that are identifiable through the usual search mechanisms available
in this country, it was still necessary to provide more specific guidelines
by which those searching for reports could discern relevance so that
decisions as to what reports to include would be made consistently. A
statement consistent with Chapter 1 was written for the guidance of project
staff (see STATEMENT OF GENERAL PURPOSE AND POPULATIONS in Appendix A). When
a person was not certain whether to include or exclude a report, the opinion
of another staff member was obtained. In cases where the project director
was not one of the first two conferees and doubt remained, the report was
brought to him for discussion and resolution of the dilemma.

In initially identifying reports, the titles of reports, and any
descriptions of the studies when the titles were obtained from the reference
lists of other reports or reviews, were examined for words that indicated
that the intent of the research was to investigate methods for modifying
attitudes toward disabled persons. In screening actual reports for
inclusion, only those reports in which it was clear that the intent was to
investigate (1) the modification of attitudes (2) toward persons with
disabilities or handicaps (whether specific disabilities or in general) were
included.

Journal articles, master's theses, doctoral dissertations, and other
unpublished papers identifiable through conventional computer and hand search
techniques and through the bibliographies of other reports were eligible for
inclusion. Because the emphasis was on reports in English that were
accessible in this country, most of the identified studies had been conducted
in North America, particularly in the USA; but research from other countries
was not excluded.*

Of specific interest, then, were empirical investigations of the effects
of interventions, or treatments, on the attitudes of nondisabled persons
toward persons with disabilities. Correlational research was excluded. For
example, contact with disabled persons is one common type of intervention.
However, if a researcher went into a school district, obtained information
from students about the amount of their prior contact with disabled persons,
and then compared mean attitude scores for reported contact-hour groups or
correlated amount of reported contact with scores on an attitude toward
disabled persons measure, that study did not qualify. In addition to studies
with experimental and quasi-experimental designs, single-group studies that
involved a planned intervention and the collection of pretest and posttest
data were included, as well as static group designs used to investigate
program effects. Mainstreaming studies were included if the effects on
attitudes toward disabled persons had been investigated.

Research reports with any age or occupational samples were of interest,
as long as the research was directed toward changing attitudes toward
persons with disabilities or handicaps.

"Disabled or handicapped persons" was defined in terms of conventional
special education categories, as reflected in Public Law 94-142, to include:
mentally retarded, hard of hearing, deaf, speech impaired, visually
handicapped, seriously emotionally disturbed (or, mentally ill),
orthopedically impaired, deaf-blind, multi-handicapped, and learning
disabled, as well as general categories such as "the disabled", "the
handicapped", or "physically disabled". Studies of subjects from populations
such as "disadvantaged students", "disruptive students", or "slow learners"
were not included.

*Of 705 effect sizes from our population of studies, 653, or 93 percent, were
based on U.S.A. samples of subjects. Of the remaining 52 effect sizes, 15
(2%) came from Canadian samples, 8 (1%) from Australian or New Zealand
samples, and 29 (4%) from 5 other countries (Ghana, Israel, Jamaica,
Thailand, and Turkey).

Attitudes toward disabled or handicapped persons was the dependent
variable of interest in identifying and selecting primary reports. It was
recognized that, consistent with common definitions (e.g., Triandis,
Adamopoulos, & Brinberg, 1984), researchers might consider attitudes (which
we defined, to provide context, as "interrelated beliefs about and feelings
toward an object which predispose the person to act in certain ways") as
having cognitive, affective, and/or behavioral components. It was also
recognized that "attitudes" might be assessed in a variety of ways, including
paper-and-pencil tests with items that are cognitive-affective mixtures,
assessments of changes in voluntary interactions with disabled persons, or
reactions on projective-type tests. Measures which assessed only knowledge
about the disabled did not qualify for selection, unless clearly considered
by the research report author(s) to be attitude assessments; nor did measures
which assessed attitudes toward mainstreaming qualify. General measures of
attitudes toward children or other people were not included, unless
specifically aimed at disabled persons or a particular type of disability,
through instructions to the Ss or because of the context of the study (e.g.,
an attempt to change parents' attitudes toward their disabled children).

Measures such as sociometric scales, friendship choices, or observations
of interactions were considered relevant only if clearly considered by the
researcher(s) to be assessments of attitudes. Even if considered in the
report to be attitude assessments, observational or other data were not
included if the behaviors or responses of nondisabled Ss toward disabled
persons, or the direction of behavioral or response change, could not be
identified.

The intent in conducting a meta-analysis type of study is to obtain
effect  sizes  to  be  analyzed.  Typically,  studies  from  which  effect  sizes 
cannot  be  obtained  are  discarded.  In  a  major  departure  from  that  practice, 
we  also  selected  studies  if  it  could  be  determined  that  a  result  was 
statistically  significant  and  the  study  otherwise  met  the  guidelines.  This 
methodological  decision  was  consistent  with  a  later  recommendation  by  Slavin 
(1986).  Our  purpose  in  including  such  studies  in  our  population  was  to 
investigate  whether  the  statistical  significance  and  direction  of  findings 
differed  between  studies  for  which  effect  size  information  was  available  and 
those  for  which  it  was  not. 

The search. An effort was made to identify and obtain every report of
research  in  the  target  population,  as  defined  above.  In  the  quest  for 
research  reports,  the  search  of  the  literature  began  with  a  computer  search 
that included ERIC, CEC Abstracts, Dissertation Abstracts International,
Index  Medicus,  Psychological  Abstracts,  and  Social  Science  Research,  using 
the  descriptor  "attitude  change"  with  the  broad  descriptor,  "disabilities", 
as  well  as  with  descriptors  specific  to  types  of  disabilities  such  as  "mental 
retardation"  and  "deaf".  The  computer  search  was  updated  twice  during  the 
duration  of  the  project.  Hand  searches  of  Psychological  Abstracts,  Education 
Index,  and  Dissertation  Abstracts  International  were  also  done.  Also,  the 
references  in  Attitudes  and  Disability:  An  Annotated  Bibliography,  1975-1981 
(Regional Rehabilitation Research Institute on Attitudinal, Legal, and
Leisure Barriers, George Washington University) were checked. In addition,
the reference lists in all of the reviews cited in Chapter 2 were searched,
as was the reference list in each primary research report we obtained,
whether or not it was decided to include the report in our review.

Data were not kept on the yield of our various sources, but it was clear
that the computer search yielded the smallest number of reports and the
reference lists of primary research reports the largest number. This search
outcome is consistent with the experience of the staff of Utah State
University's Early Intervention Research Institute, who have conducted major
meta-analyses of early intervention research. From 10 to 30% of the
available primary research reports are likely to be identified via computer
searches, with the majority coming from the reference lists of other reviews
and primary research reports. The shortcomings of our computerized searches,
as helpful as they were, also reflected our prior experience with searches
that did not yield relevant reports that we knew were in the data base
because, for example, one of us had authored them. And, it is consistent
with the lament by Bracey in his Research column in the January, 1985 Kappan
that a colleague who did an ERIC search of research on reading comprehension
"could find only 46% of the articles he knew to exist" (italics in the
original, p. 395). Clearly, a search strategy in which computerized indexes
are the only source, or even one in which only computerized and print indexes
are used, is not likely to meet the ideal of finding all available studies in
the defined area of interest. This caveat is particularly noteworthy in a
time when some reviewers assume that a computer search is adequate to the
task.

Copies of some 667 primary research reports that were judged potentially
relevant based on title and abstract or reference in a review or primary
research report were obtained through a variety of sources. The journal and
the ERIC microfiche collections in the Utah State University, University of
British Columbia, Simon Fraser, and Western Washington University libraries
were utilized. In addition, the Interlibrary Loan Department of the Utah
State University library sent out 218 requests for reports for us, of which
187 (86%) were received. Included were 77 dissertations, many of which had
been identified in Dissertation Abstracts International. (No dissertation
abstracts were included in the review because of the limited amount of
information they contain.) In addition, hard copies of 154 dissertations not
available through Interlibrary Loan or from the authors were purchased from
University Microfilms, Inc. (That represented a considerable expense at $25
plus $2.25 shipping per dissertation; UMI personnel were very cooperative.)

Each  of  the  667  primary  research  reports  obtained  was  screened  and  273 
were  judged  to  be  relevant  to  the  review  topic.  Bullock  and  Svyantek  (1985) 
have  underscored  the  importance  of  providing  a  listing  of  the  studies  which 
constituted  the  data  set  for  a  quantitative  review.  It  is  equally  important 
to  report  those  studies  explicitly  excluded  from  the  analysis  so  that  readers 
can  comprehend  the  scope  of  the  data  set  and  judge  its  adequacy  as  a 
population  or  sample  of  studies.  The  363  reports  that  were  discarded  as 
irrelevant  for  our  analysis  are  listed  with  brief  identifying  information  in 
Appendix D, as are the 31 discarded for lack of information.

Some  comments  on  specific  discards  may  help  to  make  clear  how  the 
criteria  for  inclusion  were  applied.  Two  of  the  discarded  reports  dealt  with 
attitude change, but not within the context of our review, which was a
concern with how to make attitudes toward disabled persons more positive in
order to enhance equality of educational, social, and economic opportunity.
In his master's thesis, English (1966) attempted to make more negative the
attitudes toward blind persons of 57 undergraduate students (as compared to
the attitudes of 59 control Ss) in two introductory sociology classes. The
Ss listened to 3 ten-minute tape recordings represented to be: (1) comments
by a sighted student, disgusted at being exploited by her blind roommate; (2)
a homosexual person who justified homosexuality based on his blindness; and,
(3) a blind person who expressed feelings of self-pity and resentment of
sighted people. Attitude change in a negative direction was predicted for
the experimental group. Analysis of pre-posttest data from the Attitude
Toward Blindness Scale indicated a lack of statistical significance, although
our computation of an effect size (based on pre-posttest mean gains and using
as the standard deviation the pretest standard deviation for the experimental
group pooled with the pretest and posttest standard deviations for the
control group) yielded a value of -.19.
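
The computation described above can be sketched in Python as follows. The pooling formula (degrees-of-freedom weighting of the variances), the sign convention, and the function names are assumptions made for illustration; the report gives only the verbal description, and no score values from the thesis are reproduced here.

```python
import math


def pooled_sd(sds_and_ns):
    """sds_and_ns: list of (standard deviation, sample size) pairs.

    Pools the variances, weighting each by its degrees of freedom (n - 1).
    """
    numerator = sum((n - 1) * sd ** 2 for sd, n in sds_and_ns)
    denominator = sum(n - 1 for _, n in sds_and_ns)
    return math.sqrt(numerator / denominator)


def gain_effect_size(exp_pre, exp_post, ctl_pre, ctl_post, sd):
    """Difference between experimental and control mean gains, in SD units."""
    return ((exp_post - exp_pre) - (ctl_post - ctl_pre)) / sd
```

In this sketch, the standard deviation passed to `gain_effect_size` would come from `pooled_sd` applied to the experimental pretest SD and the control pretest and posttest SDs. A negative result indicates that the experimental group's scores moved downward relative to the control group's.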

English's intent to produce more negative attitudes not only ran counter
to our concerns about modifying attitudes, but raised a serious and obvious
ethical question, which was not addressed in the thesis. We continue to be
perplexed at the justification for the research, given no stated practical or
theoretical reason to attempt to make college students' attitudes toward a
group of persons with a disability more negative. English's methodology and
silence on the ethical issue are, for example, in stark contrast to the
decision by Oberle (1975) not to include a "negative contact condition" even
though it might have added to the understanding of attitudes toward disabled
job applicants, because to do so would be "imcompatible [sic] with the ethics
of experimentation with human subjects and the overall philosophy of
rehabilitation" (p. 108).

Another discarded article resembled the English (1966) thesis, in that
attitudes toward a Down syndrome child were expected to become more negative
as a result of group discussion of feelings about the target child
(Siperstein, Bak, & Gottlieb, 1977). However, this study was justified on
the grounds of better understanding what might happen as more handicapped
children are integrated in regular classrooms. Also, rather than an effort
to change attitudes in a negative direction through use of materials
presenting negative stereotypes of persons with a disability, the Siperstein
et al. study was an investigation of the effects of discussion of a situation
in which a Down syndrome child appeared as academically incompetent. In any
event, the intent of the study was not to investigate whether an intervention
would improve attitudes toward disabled persons, and it was discarded.

A  further  illustration  of  the  types  of  decisions  made  in  discarding 
studies  involves  an  often  cited  study  by  Gottlieb  (1972)  which  was  not 
included  in  our  review.  In  that  research,  children  chose  whether  to  play  a 
ring-toss  game  with  a  mentally  retarded  child  under  two  conditions — 
expectation of success or lack of success (a ring toss from 3 or 12 feet) and
level of reinforcement (5 cents or 50 cents for winning). Volunteering was
confounded with the expectation and reinforcement variables; and the study,
justifiably, got at variables related to attitude change with voluntary
contact rather than at the effects on attitudes of manipulating contact and
of manipulated contact plus reinforcement (which was not manipulated
vis-a-vis contact, but was a function of ring-toss performance).

As is typical in using the meta-analysis approach, a central interest
was in obtaining effect sizes to use in quantitative analyses of the
literature. The procedures for computing effect sizes will be described in
the next section. It is, however, pertinent to a discussion of the number of
articles obtained in the search to note that we wrote to authors for
information when the description in their report was inadequate for effect
size computations.

Addresses were sought in the 1986 National Faculty Directory, the
American Psychological Association's 1984 Membership Register, and the
American  Educational  Research  Association's  1985-86  Biographical  Membership 
Directory.  If  an  author's  address  was  not  available  in  any  of  those  sources, 
a  letter  was  sent  to  the  address  on  the  article  or,  in  a  few  cases,  to  the 
major  professor  for  a  dissertation.  Multiple  letters  were  sent  for  a  report 
when  there  were  multiple  authors  or  when  a  follow-up  seemed  in  order,  e.g., 
to  request  additional  information. 

One  hundred  and  forty-six  letters  were  sent  to  authors  to  request 
information  for  117  reports.  The  results  are  presented  in  Table  5.  In  the 
case  of  53  studies  (45%),  nothing  was  heard.  (For  74,  or  51%,  of  the  146 
letters  sent,  there  were  no  responses.)  For  13  reports  (16  letters),  the 
letters  were  returned  by  the  Post  Office  as  undeliverable  or  someone  wrote  to 
say  some  such  thing  as  that  the  author  was  dead  or  had  moved  leaving  no 
forwarding  address;  for  three  reports,  we  were  informed  that  the  person  to 
whom  we  wrote  was  not  the  author.  For  23  reports  (20%),  authors  wrote  to 
tell us the information we had requested was not available, a courtesy we
very  much  appreciated,  and  which  was  acknowledged  (as  were  all  letters 
received)  with  a  short  thank-you  letter.  For  14  reports,  information  was 
sent  that  was  different  from  that  requested  and  did  not  permit  effect  size 
computations.  And,  for  14  reports,  we  received  information  that  allowed  the 
desired  effect  size  computations. 

Table 5

Responses to Information Requests

                                           Type of Report
                            ------------------------------------------
                                                     Project
Type of Response            Dissertations  Articles  Reports   Papers  Total
------------------------------------------------------------------------------------
None                        25/30          24/38     3/4       1/2     53(45%)/74(51%)
Author unavailable
  Letter returned by P.O.   2/2            5/8                         7(5%)/10(7%)
  Author dead/moved                        3/3                         3(3%)/3(2%)
  Wrong person              2/1            /1                  1/1     3(3%)/3(2%)
Info. not available         1/1            22/23     /1                23(20%)/25(17%)
Info. provided
  Incorrect                 2/2            9/11      3/3               14(12%)/16(11%)
  Correct                   2/2            9/9       1/1       2/3     14(12%)/15(10%)
Total                       34/38          72/93     7/9       4/6     117/146

Note: In each cell, the number (or percent) of reports for which information
was requested is to the left of the slash; the number (or percent) of letters
sent to obtain information is to the right of the slash.

Our 12% fruitful response rate for studies (10% of letters) is close to
what has been reported by others. For example, Wolins (1962), as reported by
Glass (1977), received useful data from 5 of 37, or 13.5%, of the authors
queried; Hyde (1982), as reported by Orwin and Cordray (1985), got
information from 2 of 18 authors, or 11%. Yet, it was disappointing to find
again how rarely data could be obtained from authors. It was particularly
disappointing to receive no response at all to 51 percent of our information
request letters. Despite the return of nine percent of the letters by the
Post Office or someone at the receiving institution, it is likely that some
of the 74 letters for which we had no response went astray. There apparently
were, however, a fairly sizable number of authors who simply chose not to
respond, not a very collegial reaction.
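
The response rates quoted above can be rechecked from the row totals of Table 5. The short Python sketch below does that arithmetic; the dictionary keys and variable names are illustrative labels, while the counts are the report and letter totals given in the table.

```python
# Row totals from Table 5: (reports, letters) for each type of response.
TABLE_5_TOTALS = {
    "none": (53, 74),
    "letter returned by P.O.": (7, 10),
    "author dead or moved": (3, 3),
    "wrong person": (3, 3),
    "info. not available": (23, 25),
    "info. provided, incorrect": (14, 16),
    "info. provided, correct": (14, 15),
}


def percent(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100.0 * part / whole)


reports_total = sum(reports for reports, _ in TABLE_5_TOTALS.values())
letters_total = sum(letters for _, letters in TABLE_5_TOTALS.values())

no_response = (percent(53, reports_total), percent(74, letters_total))
fruitful = (percent(14, reports_total), percent(15, letters_total))
# Letters returned by the Post Office or by someone at the institution.
returned_letters = percent(10 + 3, letters_total)
```

The row totals sum to the 117 reports and 146 letters stated in the text, and the rounded percentages agree with the 45% and 51% no-response figures, the 12% and 10% fruitful response figures, and the nine percent of letters returned.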

Although the information requested was obtained for only 14 of 117
reports, not all of the remaining 103 reports were discarded. In some cases,
information was available to compute one or more effect sizes other than that
for which additional information was requested. Also, as noted above,
contrary to what is the case in most meta-analytic reviews, it was decided
that valuable information would be gained by collecting data from reports for
which effect sizes could not be computed but for which it could be determined
whether results were statistically significant at the .05 level. As a
result, only 31 reports were discarded for lack of information. They are
listed in Appendix D.

The  remaining  273  reports  were  the  accessible  population  for  the 
integrative  review.  They  are  listed  in  Appendix  E,  and  a  brief  description 
of  each  study  is  presented  in  Appendix  F. 

Instrumentation  and  Data  Collection 

The basic meta-analytic approach involves quantifying the outcomes of
primary research studies using a common metric and coding various study
characteristics so that it can be determined whether outcomes covary with the
treatment variable and with any other study characteristics. The
classification system used to code primary studies is, therefore, fundamental
to data collection and data analysis. It must be comprehensive enough to
"capture" the factors which are contributing to variance among studies, but
not be so complex as to make coding overly burdensome. There are at least
three other major considerations in developing a coding instrument: (1) that
the data be collected in a usable format; (2) that the coding instrument
adequately reflect the substantive area under review; and (3) that
appropriate nontreatment study characteristics be coded.

The first consideration may seem mundane, but it is practically very
important, especially if the meta-analysis is of sufficient scope that a
mainframe computer will be used for analysis. In regard to format, a coding
instrument developed at Utah State University's Early Intervention Research
Institute for a meta-analysis of early intervention research with at-risk
children (White & Casto, 1985) was of great value, as was consultation with
the principal investigators. The second major consideration, that the coding
instrument reflect adequately the substantive area under review (in this
case, efforts to modify attitudes toward disabled persons), raised the
concern that we be able to code correctly the attitude-change
interventions reported in the literature. Our prior review of research,
reported in Chapter 2, helped to ensure that, as did the prior reading of a
number of the primary research reports and tryouts of the instrument on
research reports as it was developed. The third major consideration, that
the coding instrument contain appropriate categories for identifying study
characteristics that might be expected to covary with and perhaps confound
treatment effects, was addressed based on the literature on research design
(e.g., Campbell & Stanley, 1963; Cook & Campbell, 1979) and meta-analysis.

The result of our instrument development was a 20-page coding instrument
with some 162 categories (see Appendix B), including a 2-page "prior contact"
supplement with 14 categories and a 2-page "contact" supplement with 18
categories. Development of the main instrument took place over a 3-month
period; revisions continued until the scoring of new reports could be
accomplished reliably, with no distortion of studies to fit the categories,
and no important information being left out.

Modifications. Even with the care in development and tryout, it proved
to be impossible to anticipate all important coding alternatives and some
changes were made after scoring was in progress. For example, we became
sensitive to the fact, not commented on in other reports of meta-analyses we
had read, that a positive effect size might actually reflect a negative
treatment effect. That is, a treatment group could have a decline in scores,
indicating more negative attitudes, from pretest to posttest, but because the
control group's scores dropped even more, the effect size would be positive.
That offended our sense of what the researchers were attempting to do and of
what we wanted a positive effect size to indicate in our review.
Consequently, we added a category to identify studies with positive effect
sizes, but with no or negative mean attitude changes for a treatment group.
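
The sign problem described above can be made concrete with a small numerical
sketch. It assumes that higher scores mean more positive attitudes and that
the effect size is the posttest difference between group means divided by a
standard deviation. The function names and the scores are hypothetical,
chosen only for illustration; they are not taken from the coding instrument.

```python
def standardized_difference(treat_post_mean, ctrl_post_mean, sd):
    """Posttest mean difference (treatment minus control) in SD units."""
    return (treat_post_mean - ctrl_post_mean) / sd


def treatment_change_code(treat_pre_mean, treat_post_mean):
    """Code the direction of the treatment group's own pre-to-post change."""
    change = treat_post_mean - treat_pre_mean
    return "positive" if change > 0 else "nil_or_negative"


# Hypothetical means: both groups start at 50; the treatment group drops
# to 48, while the control group drops further, to 44.
effect_size = standardized_difference(48.0, 44.0, sd=8.0)
change_code = treatment_change_code(50.0, 48.0)

# The pattern of interest: a positive effect size without a positive
# mean change for the treatment group.
flagged = effect_size > 0 and change_code == "nil_or_negative"
print(effect_size, change_code, flagged)
```

In this illustration the effect size is positive even though the treatment
group's attitudes became more negative, which is the kind of case the added
category was meant to identify.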

Another example of "in process" revisions had to do with "prior
contact". After coding a number of studies, it seemed evident that we were
not collecting adequate data on that important potential moderating variable.
Consequently, the Prior Contact Coding Sheet was developed. By the same
token, a Contact Coding Sheet was developed to obtain more information than
we originally anticipated to be necessary on the extent to which contact
treatments reflected variables considered theoretically important (see, e.g.,
Yuker et al., 1970; Yuker, 1986; Makas, 1986).

In  each  instance  of  a  coding  change,  all  studies  already  coded  were 
checked  against  the  revised  category,  and  new  scores  entered  as  necessary. 
This  was  done  for  the  Prior  Contact  and  Contact  Coding  Sheets  after  all  of 
the  reports  had  been  coded  with  the  main  coding  instrument. 

Coding  conventions.  The  coding  instrument,  including  the  Prior  Contact 
and  Contact  Coding  Sheets,  contains  ten  sets  of  categories.  As  is  typical  in 
meta-analyses,  an  extensive  set  of  conventions  for  coding  studies  was  also 
developed. These are presented in Appendix C.

The  coding  conventions  begin  with  general  instructions  for  the  coders. 
Included  in  these  directions  are  such  items  as  how  to  use  the  various 
supplementary  sheets  (for  identifying  effect  sizes,  indicating  that  further 
information  on  a  study  should  be  requested,  and  making  comments  on  the  study 
or  on  the  use  of  conventions  that  might  later  help  in  interpreting  the 
results),  what  to  do  with  blank  spaces,  how  to  complete  the  checklist  at  the 
beginning  of  the  coding  instrument,  how  to  handle  multiple  reports  of  a 
single  study,  and  how  to  round  decimals  in  computing  effect  sizes. 

Instructions for coding individual categories were also provided. The
conventions pages are numbered for the convenience of the coder to indicate
the set of categories being discussed, as well as the page number on the
coding instrument where the category being discussed is located. For
example, "A-2/CI 1-2" indicates that the page is the second page of the
coding conventions for Section A of the coding instrument (A-2) and that the
categories being discussed are on pages 1 and 2 of the coding instrument (CI
1-2).

In addition to the conventions for coding categories, there is a section
on computing effect sizes and a section of addenda added to the conventions
for clarification as the coding progressed and items of concern were raised
that could not be handled easily with revisions in the coding conventions.
All told there are approximately 90 pages of conventions, plus 10 pages of
effect size computation information and 5 pages of addenda. A discussion of
the sets of coding categories follows.

A.  General  Information.

As the title of this section of the coding instrument suggests, it
contains categories to gather miscellaneous but important information about
the studies. Categories A.4. and A.5. call for the recording of information
about the year in which the report was made available and the type of report.
These have been common pieces of information gathered in meta-analyses
because of interest in any trends in research over time and in whether
findings are related to how the author chooses, or is allowed, to make public
his or her results.

The number of effect sizes for each primary report (Category A.6.a.) was
recorded to provide ready information during data analysis. By the same
token, each effect size was assigned an identification number for convenience
of analysis.

As Jackson (1980) and Green and Hall (1984) have noted, those doing
meta-analyses have tended to focus on the main effects of studies. These are
what we have defined as primary effect sizes (Level 1 in Category A.6.c.).
Although main effects were to be the primary focus of analysis, it was
decided that in gathering data for this integrative review, we would also
code information to allow the possibility of analyzing the data for
interactions. Secondary effect sizes are those, then, that involve
interaction effects or comparisons of treatment conditions within levels of a
classification independent variable (in this case, the classification
variables of gender, testing, age, personality, and prior contact with
disabled persons were selected because of their assumed importance in the
literature). Category A.6.d. "Type of comparison" was included to separate
out the basic types of primary comparisons (codes 1-4) and secondary
comparisons (codes 5-15).

The next three categories, dealing with populations and whether the
study was a replication, were included because of a general interest in the
extent to which authors of research reports define their target and
accessible populations and to which replication is a part of educational
and psychological research (Shaver & Norton, 1980a, b).

B.  Description  of  Sample(s). 
A number of the characteristics of the subjects (Ss) in the samples used
for primary research were deemed to be of potential importance in attempting
to integrate and interpret the research literature. These included: sample
size; how the sample was selected (Category B.2., based on the categories
that Shaver and Norton [1980a, b] used in their reviews of the research
literature); the percentage of males in the sample (because some researchers
have found females to have more positive attitudes toward disabled persons
and to be more amenable to attitude change interventions); the context within
which the intervention or treatment occurred (e.g., was it an educational or
a work context); in the case of an educational context, the educational level
(age) of the students and, for university students, major (some of the majors
[Category B.6.] that may seem surprising, such as philosophy, came from a
perusal of the research studies in the pool to be coded); the occupation of
non-student subjects; and the Ss' prior experience with disabled persons.
An additional sample attribute of interest was the country from which the
subjects came (Category B.9.).

C.  Treatment/Intervention.

Describing the attitude modification interventions that had been
investigated was a primary concern. One important attribute of those
interventions was whether they had been grounded in theory. As noted in
Chapter 2, prior reviews (especially Towner, 1985) have lamented the lack of
theoretical bases for attitude modification efforts. In order to determine
the role of attitude change theory in the studies in our integrative review,
the coding instrument included categories for identifying whether an explicit
basis for the treatment could be identified and, if so, whether it was based
on theory, on prior research, or on practical experience (C.1.).

Categories were also included in which to indicate the theory that
explicitly or implicitly underlay the intervention (C.2.a.). Five attitude
change theoretical positions were identified, based on the summations and
discussions of theory by Insko (1967), Kiesler, Collins, and Miller (1969),
Thompson (1975), Triandis (1971), Wagner and Sherwood (1969), and Zimbardo,
Ebbesen, and Maslach (1977). (Also see Watts, 1984.) The five categories of
theory are: (1) stimulus-response and behavioristic; (2) conditioning; (3)
consistency-equilibrium; (4) social judgment; and (5) functional. A
category was also included for scoring a theory base that was a combination
of these positions. Conventions for coding theoretical bases are included in
Appendix C.

The use made of theory was also scored. The uses to be coded were:
none (mentioned, but not used in presenting the design or discussing the
results), a brief or well-developed discussion of the theory base, use of
theory only for post hoc interpretation, or whether the coder had to infer
the underlying theory because it was not mentioned in the report (C.2.b.).

Setting 

A significant element of treatment, that is, the setting in which it
took place (whether, for example, in a regular classroom, a research
laboratory, or an institution for the mentally retarded), was also coded
(C.3.).

Treatment  Characteristics 

Several categories were devoted to describing the intervention
techniques. Based on the prior review of literature, three predominant
approaches had been identified. These were: the conveying of information
about disabilities and disabled persons; providing direct contact with
disabled persons; and providing vicarious disability experiences, such as
through simulations. In addition, combinations of those three major
techniques were used. Another technique, related to the presentation of
information, was the use of persuasive communications. And, a few
researchers used positive reinforcement or systematic desensitization in an
effort to modify attitudes. Each of these techniques was included in
Category C.4., along with codes for control and placebo groups.

If the conveying of information was coded as the treatment or a major
part of the treatment, the type of information and delivery mode were coded
in Categories C.4.a.(1) and C.4.a.(2). If direct contact was the treatment
technique or a major part of it, the type of contact was coded in Category
C.4.b. Types of vicarious experience were coded in Category C.4.c., and
types of positive reinforcement and types of persuasive messages were coded
in Categories C.4.d. and C.4.e.

It should be noted that any description of the treatment in the report
was used in coding, not the label attached to the treatment by the report
author(s). This was particularly important in studies of contact as an
attitude modification technique. As Makas (1986) has noted, some researchers
(e.g., Donaldson, 1974; Sadlick & Penta, 1975) have labeled media
presentations of disabled persons as "contact". We did not code those as
contact treatments, but as use of persuasive messages and vicarious
experience, for example, as was deemed appropriate.

Another significant aspect of an attitude change treatment is the
intended attitude target. In this case, the question was: Attitudes toward
persons with what disabilities were to be modified by the intervention? That
was coded in Category C.5.

A potentially important methodological dimension of a treatment (coded
in Category C.6.) is whether it was conducted by the experimenter himself or
herself, by project assistants, or by regular nonproject personnel in their
work setting, such as regular classroom teachers. As the
outcomes of treatments or interventions could covary with duration (see,
e.g., Makas, 1986), whether information on length of treatment was available
was coded and, if available, the duration (the total number of hours of
treatment was of particular interest) was entered (Category C.7.).

Treatment  Verification 

One of the major problems affecting the interpretation of primary
research studies is the frequent lack of verification that the treatment was
actually implemented as intended (Cook & Campbell, 1979; Hunter et al., 1982,
pp. 95-96; Ladas, 1980; Shaver, 1983). Consequently, a set of categories
(C.8. through C.8.e.) was devoted to treatment verification. Coding
included whether the report contained any indication that treatment
implementation was verified and, if so, what type of verification took place,
how it was reported, and the degree of implementation that was claimed by the
author(s). In addition, the coder estimated the extent to which
implementation actually took place as intended; and, because of the
importance accorded replication in discussions of research (Shaver, 1979a;
Shaver & Norton, 1980a, b), a category was included for indicating whether
the description of treatment in the report would provide an adequate basis
for replication should another researcher wish to do so.

Treatment  Validity 

It is possible that a treatment which was executed exactly as intended
might not validly represent the anticipated independent variable. That is,
the experiences of the subjects may not be what was intended by the
researcher. This concept of treatment validity is similar to Cook and
Campbell's (1979) concept of the construct validity of presumed causes and
effects. Using this concept, one asks: If the intervention operations are
presumed to represent a construct with an assumed cause and effect
relationship to attitudes, are there other confounding variables which would
invalidate cause and effect inferences? Cook and Campbell (p. 60) refer to
the Hawthorne effect as one such confounding variable. The question is
whether an increase in performance was due to the treatment as conceptualized
by the researcher or due to the Ss' awareness of the attention accorded them.
The coding instrument contains categories for coding the potential impact of
confounding variables that might covary with the independent variable, thus
threatening the validity of the treatment. These potentially confounding
variables (which are, along with the Hawthorne effect, the extent to which
the treatment was actually implemented, the John Henry effect, treatment
diffusion, Ss' dissatisfaction or resentment, novelty or disruption effects,
experimenter effects and expectations, the confounding of treatment and
experimenter, test by treatment interaction, and multiple treatment
interference) constitute the ten subcategories of Category C.9. Based upon
the coding of those subcategories, the coder made an overall judgment as to
whether the general treatment validity was "excellent", "fair", or "poor"
(C.9.k.).

Our subcategories for treatment validity are frequently included in
lists of threats to external validity (e.g., Bracht & Glass, 1968; Campbell &
Stanley, 1963). This makes conceptual sense, as external validity (that is,
the extent to which one can generalize results) depends to a large extent upon
whether the treatment was validly implemented. If there are serious threats
to treatment validity, the person who tries to generalize from the research
results is left with the puzzling conundrum as to what it is that might be
generalized to other treatment situations.

Can't  Tell  Option 

It should be noted that contrary to what is common with meta-analysis
coding systems, in the treatment validity subcategories, and elsewhere in the
coding instrument, the coder was provided with "can't tell" as an option. In
many meta-analyses, if study characteristics or their effects are not
described in a report, coders are asked to make a reasoned judgment as to the
nature of the study and any threats to validity. Sometimes coders are
instructed to assume that no threat was present if there is no evidence of
the threat. Bullock and Svyantek (1985) have, however, indicated the
importance of coding for missing information to indicate when adequate data
were not reported to code a category. That is the strategy we adopted. The
coder was, however, forced to make a decision about general treatment
validity, with the number of "can't tell" codings a factor in that decision
(see coding convention for Category C.9.k. in Appendix C).

Mainstreaming  Studies 

As noted in Chapter 2, mainstreaming studies that included explicit
concern with the modification of attitudes toward disabled persons were
included in the accessible population of reports. As a particular type of
treatment, mainstreaming was accorded a major category (C.10.) with 15
subcategories to assess the type of instruction in mainstreamed classes, any
special personnel support or special training for disabled students,
nondisabled students, or parents that was provided, the length of the
mainstreaming treatment, and the Ss for whom attitude outcomes were assessed.

D.  Dependent  Measures. 
A critical aspect of primary research that ought to be addressed in an
integrative review of literature is the instruments used to assess the
dependent variables which are of interest as outcomes. The validity of
scores is a primary concern. The centrality of appropriate assessment to the
drawing of conclusions about treatment effects is indicated by Cook and
Campbell's (1979, pp. 60-61) decision to include whether the "proposed
dependent variables . . . tap into the factors they are meant to measure" as
a part of the cause-and-effect construct validity of experimental designs.
Among the prior reviews of research on modifying attitudes toward disabled
persons, Towner (1985) in particular raised questions about the validity of
attitude assessments, noting that many primary researchers did not even
define the construct of attitude which underlay their research.

Reactivity of assessment is a concern in the validity of attitude scores
(Matkin, Hafer, Wright, & Lutzker, 1981; Wilson & Putnam, 1982).
Consequently, coding included not only whether test validity was discussed
and the type and source of validity, if reported, but an estimate of the
reactivity of the measure as well. The coder then made an overall judgment
of the adequacy of instrument validity ("low", "moderate", "high"), with the
convention that highly reactive measures were automatically scored no higher
than "moderate".
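
The capping convention just described can be expressed as a simple rule. The
sketch below is illustrative only: the function and argument names are
hypothetical, and the coder's initial judgment is treated as an input rather
than something the code derives.

```python
LEVELS = ["low", "moderate", "high"]  # ordered from weakest to strongest


def overall_instrument_validity(coder_judgment, highly_reactive):
    """Apply the cap: highly reactive measures score no higher than 'moderate'."""
    if coder_judgment not in LEVELS:
        raise ValueError(f"unknown validity level: {coder_judgment!r}")
    if highly_reactive and LEVELS.index(coder_judgment) > LEVELS.index("moderate"):
        return "moderate"
    return coder_judgment


# A measure judged "high" but highly reactive is recorded as "moderate";
# a "low" judgment is unaffected by the cap.
print(overall_instrument_validity("high", highly_reactive=True))
print(overall_instrument_validity("low", highly_reactive=True))
```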

Underlying  validity,  of  course,  are  the  matters  of  the  reliability  of 
scores,  the  types  of  instrumentation  used,  and  how  data  collection  and 
scoring  were  carried  out  (for  example,  were  data  collectors  and  scorers  blind 
to  treatment  group  membership  when  that  was  important?). 

These various attributes of instrumentation and data collection were
coded in Section D. of the coding instrument. Categories included, in
addition to the attributes mentioned above, whether the posttest was
administered immediately at the conclusion of treatment or delayed for more
than a day, whether there was follow-up posttesting, and, if available, the
time from intervention conclusion to posttesting.

E.  Internal  Validity.
How to deal with the quality of the primary research reports is a major
issue raised in the literature on quantitative integrative reviews.
Bangert-Drowns (1986) has provided an excellent summary of the debate over
study quality. While Glass (1976, 1977) advocated that all studies be
included in a quantitative review, regardless of quality, so that the
covariation of study outcomes with study quality could be investigated,
others (e.g., Eysenck, 1978; Mansfield & Busse, 1977) have raised questions
about that strategy. In particular, the investigation of study
quality-outcome covariation is not likely to be fruitful if there is a
constant bias or flaw in the research in a field (Bryant & Workman, 1985,
pp. 635-636; Shaver, 1979a) or if there is a lack of variability, in
particular a lack of well-designed studies against which to compare the
outcomes of studies with methodological flaws (Bangert-Drowns, 1986).

As noted in Chapter 2, the clear intent of this study was to follow the
lead of Glass and investigate the relationship between study quality and
outcomes. That was the thrust of the sets of categories, discussed above,
that deal with treatment validity and the dependent measures. It was also
the purpose of the set of categories dealing with internal validity. In
these categories, each study was coded according to the particular design
used, the method of assigning subjects to groups, and the extent to which
there were threats to internal validity, based on the standard Campbell and
Stanley (1963) listing, with one important addition. The total treatment
validity score (Category C.9.k.) was included as a potential threat, on the
grounds that it is nonsensical to discuss whether an outcome can be
attributed to the treatment (i.e., to consider internal validity questions)
if the treatment was not implemented adequately.

Although some reviewers using quantitative techniques add up scores on
individual subcategories to obtain a total quality of internal validity or
methodology score (e.g., Bullock & Svyantek, 1985), that approach was not
adopted for this review. Coders were given guidelines (see the conventions
in Appendix C) for arriving at a judgment of general internal validity
("low", "medium", "high") based on the understanding that internal validity
is not a matter of simply adding up the category scores for individual
threats, because any one threat may be fatal. That is, a study may be
well-designed in most regards, receiving high scores on every category of
internal validity but one, and still lack internal validity. For example, if
the experimental groups were exposed to different histories that had clear
potential for differential effects on outcomes, then internal validity is low
regardless of how well other threats have been controlled. Although the coder
had available a "can't tell" option for coding specific threats to internal
validity, a forced choice was made as to whether the general internal
validity was low, medium, or high.
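
The contrast between an additive score and the "any one threat may be fatal"
reasoning can be illustrated with a small sketch. The 3-point scale, the
function names, and the example study are all hypothetical; the actual
guidelines in Appendix C called for a coder's judgment, which this
simplification does not reproduce.

```python
# Hypothetical scale for each threat: 1 = serious threat,
# 2 = possible threat, 3 = threat controlled.

def additive_quality(threat_scores):
    """Sum of the individual threat scores (the approach not adopted)."""
    return sum(threat_scores.values())


def has_fatal_threat(threat_scores, serious=1):
    """True if any single threat is coded as serious."""
    return any(score == serious for score in threat_scores.values())


# An otherwise well-designed study whose groups had different histories.
study = {
    "history": 1,
    "maturation": 3,
    "testing": 3,
    "instrumentation": 3,
    "regression": 3,
    "selection": 3,
    "mortality": 3,
    "selection_maturation": 3,
    "treatment_validity": 3,
}

total = additive_quality(study)   # close to the maximum possible sum
fatal = has_fatal_threat(study)   # yet one threat is serious
general_internal_validity = "low" if fatal else "coder judgment required"
print(total, fatal, general_internal_validity)
```

Under the additive view this study would look strong; under the reasoning in
the text, the single serious threat is enough to rate it low.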

F.  Results. 

The major concern in coding the results of the primary research studies
reviewed was to record the effect size(s) for each study. In addition,
whether or not the result was statistically significant at the .05 level was
coded (Category F.1.). Coding also included whether the authors qualified
their conclusions about treatment effectiveness in terms of possible threats
such as the type of sample and design flaws (Shaver & Norton, 1980a, b) and
whether the author deemed the treatment to have been effective or not
(Categories F.2.a. and b.).

Effect  Size  Availability

In Category F.3.a., the coder indicated whether an effect size was
available. As was mentioned earlier in this chapter, a positive effect size
may actually reflect no change or a negative change by the experimental group
that was accompanied by a greater negative change by the control group.
Consequently, if an effect size was available, it was coded to indicate
whether the change for the treatment group was positive, on the one hand, or
nil or negative, on the other. In addition, if no effect size was available
but the level of statistical significance (above or below .05) could be
determined, whether each difference on a relevant outcome measure favored the
treatment or the comparison group was coded.

Effect  Sizes 

Next, the coder entered any effect sizes that could be computed and
coded information about the sources of the statistics used in computing or
estimating each effect size. The procedures to be followed in selecting
statistics and computing or estimating effect sizes are detailed in the
COMPUTATION OF EFFECT SIZES and CONVENTIONS ADDENDA sections of the coding
conventions (see Appendix C).

Our major indicator of effect size was Glass's Delta* (Glass, McGaw, &
Smith, 1981), which we labeled D. To compute a D, the difference between the
experimental mean and the control group mean is divided by a standard
deviation, if available, which is free of treatment effects. D is in
contradistinction to Cohen's (1977) d, in which a standard deviation based on
the within-group variance for all groups pooled is used for the
standardization of mean differences. Glass, McGaw, and Smith recommend use
of the control group standard deviation. As our purpose was to obtain the
most stable estimate of variance in the untreated population, we extended
Glass's Delta by pooling the variances available for untreated groups
(including treatment group pretest and control group pre- and posttest
variances) to obtain the standard deviation by which the difference between
means was standardized.

*Some reviewers treat Delta as if it were the effect size, rather than one
type of effect size (see, e.g., Walberg, 1986, p. 216). Walberg also seems
to consider Deltas and correlation coefficients to be equivalent (see his
Table 7.2, pp. 218-19). However, they are not the same metric. For
example, a point biserial coefficient will be considerably less than the
Delta for the same result (if D = .5, r_pb = .24; if D = 1, r_pb = .45; if
D = 2, r_pb = .71; see Appendix G). And correlation coefficients from
correlational studies would have very different meanings than either Ds or
point-biserial coefficients from experimental designs.
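As an illustration only, the extended Delta described above can be sketched
in Python. The function names are hypothetical, and weighting each variance
by its degrees of freedom (n - 1) is an assumption; the report states only
that the untreated-group variances were pooled.

```python
import math

def pooled_sd(groups):
    """Pool (sd, n) pairs into one SD, weighting each variance by n - 1.

    The n - 1 weighting is an assumed convention, not taken from the report.
    """
    numerator = sum((n - 1) * sd ** 2 for sd, n in groups)
    denominator = sum(n - 1 for _, n in groups)
    return math.sqrt(numerator / denominator)

def extended_delta(mean_treatment, mean_control, untreated_groups):
    """Mean difference standardized by the pooled untreated-group SD."""
    return (mean_treatment - mean_control) / pooled_sd(untreated_groups)

# Hypothetical inputs: treatment pretest, control pretest, control posttest.
untreated = [(9.0, 25), (10.0, 25), (11.0, 25)]
print(round(extended_delta(55.0, 50.0, untreated), 3))
```

Passing a single (sd, n) pair for the control group posttest reduces the
sketch to the control-group standardization that Glass, McGaw, and Smith
recommend.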

Standard deviations. Raw score standard deviations, or estimates of
those standard deviations, were used in computing Ds. When the only standard
deviation available in a report was from an analysis of covariance or was a
standard deviation for gain scores, an unadjusted standard deviation was
estimated (Glass et al., 1981; McGaw & Glass, 1980).

As Kulik and Kulik (1986) have pointed out, the decision to use raw
score standard deviations for computing effect sizes, rather than standard
deviations from which major sources of variation have been removed (for
example, by covariance, regression analysis, or blocking), is not a
trivial matter. Effect sizes computed with adjusted, or reduced, standard
deviations (called "operative" effect sizes) will "vary not only as a
function of size of the raw-score treatment effect but also as a function of
the experimental design used to investigate the effect" (p. 7). Although
operative effect sizes are useful in statistical power analyses, they are not
directly comparable from study to study unless the same research design and
analysis were used in each. However, effect sizes computed with raw score
standard deviations (called "interpretable" effect sizes) can be interpreted
along a common scale because they are conceptually equivalent to one another.

Estimating Ds. When the means or the standard deviation for computing a
D was not available, but the result from a test of significance, such as an
F-ratio or t-ratio, was available, D was estimated based on procedures
spelled out in Glass et al. (1981).
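For readers unfamiliar with such estimates, the following sketch shows one
widely used conversion for a two-group comparison: D is approximately t times
the square root of (1/n_treatment + 1/n_control), with t taken as the square
root of F when only a two-group F-ratio is reported. The function names are
hypothetical, and the assumption that this form covers the cases coded here
is ours; the exact procedures are those of Glass et al. (1981) and
Appendix C.

```python
import math

def d_from_t(t, n_treatment, n_control):
    """Estimate a standardized mean difference from an independent-groups t."""
    return t * math.sqrt(1.0 / n_treatment + 1.0 / n_control)

def d_from_f(f, n_treatment, n_control, treatment_favored=True):
    """Estimate D from a two-group F-ratio (F = t squared).

    F carries no sign, so the direction of the difference must be supplied.
    """
    t = math.sqrt(f)
    if not treatment_favored:
        t = -t
    return d_from_t(t, n_treatment, n_control)

print(round(d_from_t(2.0, 50, 50), 2))
print(round(d_from_f(4.0, 50, 50, treatment_favored=False), 2))
```

An estimate obtained this way rests on the pooled within-group standard
deviation implied by the test, so it is not free of treatment effects in the
way a D computed from untreated-group variances is.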

Variations on D computations. Generally, the procedures followed in
selecting statistics and computing or estimating Ds were those suggested by
Glass (1977; Glass et al., 1981) and elaborated in worksheets prepared by
Karl White, of Utah State University's Early Intervention Research Institute,
for use in his own meta-analytic work and in conducting workshops on
meta-analysis. However, how to code the information from a Solomon four-group
design had not been addressed in the literature, to the best of our
knowledge. The Solomon four-group design is, of course, a combination of two
designs: a pretest-posttest, control group design and a posttest-only,
control group design. It was tempting, on those grounds, to obtain two Ds
from each such design. However, that would have compounded the problem of
nonindependent multiple Ds from individual studies. Consequently, we
computed the two Ds and pooled them (weighting by n, if the design was not
balanced) to obtain one effect size (see Appendix C, CONVENTIONS ADDENDA,
#7).
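The pooling of the two Ds from a Solomon four-group design can be sketched as
a weighted mean. This is a minimal illustration: the function name is
hypothetical, and the use of the total number of subjects in each half of the
design as the weight is an assumption (the governing rule is Convention
Addendum #7 in Appendix C).

```python
def pool_solomon_ds(d_pretested, n_pretested, d_unpretested, n_unpretested):
    """Combine the Ds from the two halves of a Solomon four-group design.

    Each D is weighted by the number of subjects contributing to it, so a
    balanced design reduces to a simple average of the two Ds.
    """
    total_n = n_pretested + n_unpretested
    weighted_sum = d_pretested * n_pretested + d_unpretested * n_unpretested
    return weighted_sum / total_n

# Balanced design: simple average of the two Ds.
print(round(pool_solomon_ds(0.40, 40, 0.60, 40), 2))
# Unbalanced design: the larger half counts for more.
print(round(pool_solomon_ds(0.40, 60, 0.80, 20), 2))
```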

Another procedural variation had to do with results reported as
percentages. Glass (1977, pp. 369-70; Glass et al., 1981, pp. 35, 136-14(5))
has recommended the use of the probit transformation when data are reported
in terms of dichotomies. However, this transformation (which involves taking
the difference between the standard normal deviates for the percentages in
the two groups being compared) is subject to technical problems (Glass et
al., 1981, p. 138) and in practice often yields suspiciously high estimates
of D. An alternative procedure is to cast the proportions for the two groups
in a two-by-two table, compute a Phi coefficient (or compute it from the
chi-square value for such a table, if reported), and then use that
coefficient to estimate D (see Appendix C, COMPUTATION OF EFFECT SIZES). The
Phi coefficient procedure appeared to yield more reasonable estimates than did
probit transformations, and it was used in the few instances (12 out of the
644 Ds used in our main analysis) in which percentages (dichotomies) were
reported.
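The two procedures for percentages can be compared with a short sketch. The
function names are hypothetical, and the conversion from Phi to D shown here
(D = 2r / sqrt(1 - r squared), which assumes equal group sizes) is an
assumption; the conversion actually used is given in Appendix C.

```python
import math
from statistics import NormalDist

def phi_from_proportions(p_treatment, n_treatment, p_control, n_control):
    """Phi coefficient for the 2 x 2 table implied by two group proportions."""
    a = p_treatment * n_treatment      # treatment group, favorable outcome
    b = n_treatment - a                # treatment group, unfavorable outcome
    c = p_control * n_control          # control group, favorable outcome
    d = n_control - c                  # control group, unfavorable outcome
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

def d_from_r(r):
    """Convert a correlation to D, assuming equal group sizes."""
    return 2.0 * r / math.sqrt(1.0 - r * r)

def d_from_probit(p_treatment, p_control):
    """Difference between the standard normal deviates of two proportions.

    Both proportions must lie strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf
    return z(p_treatment) - z(p_control)

phi = phi_from_proportions(0.95, 20, 0.60, 20)
print(round(d_from_r(phi), 2), round(d_from_probit(0.95, 0.60), 2))
```

With hypothetical proportions near 0 or 1, as in the example call, the probit
estimate grows quickly, which is consistent with the concern about
suspiciously high estimates noted above.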

Source of D. When a D could be computed, the source (whether calculated
or, for example, estimated from a t-ratio), the scale of mean differences
(that is, whether between raw gain scores, the preferred unit, posttest
differences, or covariance adjusted means), and type of the standard
deviation (whether a control group standard deviation, a pooled standard
deviation, or an estimated standard deviation) were coded (Categories
F.3.b.2], 3], and 4]).

Correlations. Effect sizes were also coded as correlations because, for
some people, an independent-dependent variable relationship expressed in
terms of a point biserial coefficient (especially if squared) is more
amenable to interpretation than when expressed as a standardized mean
difference. When the point biserial coefficient was not reported, which was
typical, D was computed first and the D converted to a point biserial
coefficient (see the Correlation Coefficients section of the COMPUTATION OF
EFFECT SIZES section of the conventions in Appendix C).
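A sketch of the conversion between D and the point biserial coefficient
follows. It assumes equal group sizes, in which case r_pb = D / sqrt(D squared
+ 4); that assumption is ours, although this form does reproduce the three
example values given in the footnote on Delta (.24, .45, and .71 for D = .5,
1, and 2). The function names are hypothetical.

```python
import math

def d_to_rpb(d):
    """Point biserial r for a given D, assuming equal group sizes."""
    return d / math.sqrt(d * d + 4.0)

def rpb_to_d(r):
    """Inverse conversion under the same equal-group-size assumption."""
    return 2.0 * r / math.sqrt(1.0 - r * r)

for d in (0.5, 1.0, 2.0):
    r = d_to_rpb(d)
    # r squared is the proportion of variance explained.
    print(d, round(r, 2), round(r * r, 2))
```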

In analyzing the data and reporting the results, we focused almost
exclusively on Ds: they are commonly used and readily interpretable, and the
magnitude of the analysis made it unfeasible to analyze redundant indicators
of effect size. A table for transforming Ds to point biserial coefficients
is available in Appendix G for those who think more readily in terms of
correlations and variance explained.

For interaction effect sizes (which were secondary effect sizes), a D
could not be computed. Eta squared, usually available or computable from
ANOVA tables, was used as an appropriate indicator of effect size.

Variance ratios. Researchers often focus in their analyses on central
tendencies (usually means), disregarding the possibility that treatments may
have affected variability in scores. It did seem possible that attitude
modification treatments would, in some cases, increase or decrease dispersion
because of differential effects on people, depending, for example, on their
pretreatment attitudes (see, e.g., Amir, 1969). Consequently, variance
ratios were computed for any primary effect sizes for which variances were
available.
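A variance ratio of this kind can be sketched in a few lines. The direction
of the ratio (treatment group variance over control group variance) and the
function name are assumptions made for illustration; the report does not
state which variance was placed in the numerator.

```python
def variance_ratio(sd_treatment, sd_control):
    """Ratio of treatment to control posttest variance.

    Values above 1 suggest the treatment increased dispersion in scores;
    values below 1 suggest it decreased dispersion.
    """
    return (sd_treatment ** 2) / (sd_control ** 2)

# Hypothetical posttest standard deviations.
print(round(variance_ratio(12.0, 10.0), 2))
```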

Median effect sizes. Considerable concern has been expressed in the
literature about Glass's strategy (see, e.g., Glass et al., 1981) of basing
meta-analyses on effect sizes rather than on studies, when individual studies
yield multiple effect sizes (see Bangert-Drowns, 1986, for an excellent
synopsis of the debate over the nonindependence of multiple effect sizes from
individual studies). Some review methodologists recommend obtaining a single
effect size for each study or for each type of outcome in each study.
Rosenthal (1984), for example, has recommended the use of the median effect
size as a stable although conservative estimate of the effect size for an
individual study.

Rosenthal's recommendation confirmed our decision to compute a median
overall D for the posttest outcomes of each study and for each set of
follow-up posttest outcomes that was available. Median Ds were computed only
for primary effect sizes. If within-study replications were reported, medians
were computed separately for each replication. In addition, a median D was
computed for each type of assessment (Category D.6.) used in the study, for
the posttest and any follow-up posttest outcomes, and for within-group
replications.
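The grouping involved in computing median Ds can be sketched as follows. This
is a simplified illustration: the record layout (study identifier, assessment
type, D) and the function name are hypothetical, and the sketch does not
separate posttest from follow-up outcomes or handle within-study
replications, both of which were treated separately in the actual coding.

```python
from collections import defaultdict
from statistics import median

def median_ds(effect_sizes):
    """Return the median D per study and per (study, assessment type).

    effect_sizes is an iterable of (study_id, assessment_type, d) records.
    """
    overall = defaultdict(list)
    by_assessment = defaultdict(list)
    for study_id, assessment_type, d in effect_sizes:
        overall[study_id].append(d)
        by_assessment[(study_id, assessment_type)].append(d)
    overall_medians = {key: median(ds) for key, ds in overall.items()}
    assessment_medians = {key: median(ds) for key, ds in by_assessment.items()}
    return overall_medians, assessment_medians

# Hypothetical records for two studies.
records = [
    ("study_1", "questionnaire", 0.10),
    ("study_1", "questionnaire", 0.50),
    ("study_1", "behavioral", 0.90),
    ("study_2", "questionnaire", 0.30),
]
overall, by_type = median_ds(records)
print(overall)
print(by_type)
```

A study with several assessment types yields one overall median but several
assessment-type medians, which is the source of the analytic difficulty
discussed in the next paragraph.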

As will be indicated in our discussion of results, use of the median
values in our analyses was neither easy nor particularly fruitful. Use of a
median effect size is particularly feasible when reviewing studies which do
not have multiple types of outcomes or assessments, intra-study replications,
and/or repeated posttesting. Our set included all. Of particular concern
were the studies with multiple types of assessment: they yielded multiple
medians, which made analysis difficult; but the alternative, the overall
median D, was of questionable meaning. Consequently, median Ds were used in
the analysis primarily as a check on the results obtained with individual Ds.
That is, some comparisons were made to determine if the results would have
been strikingly different using median rather than individual Ds. The lesson
gained was that despite the recommendation to use median Ds in analyses, much
work needs to be done on how to best compute and identify them for easier use
in analyses that involve a large number of studies, with many having multiple
assessments.

Supplemental  Information. 
As discussed above, treatment verification and treatment validity
categories were included in the coding instrument to assess the extent to
which the independent variable was executed as intended and with construct
validity. When the conveying of information about disabilities and persons
with disabilities is the technique for modifying attitudes under
investigation, a further check on the treatment should be undertaken. That
is, it should be determined whether the subjects' knowledge of the
information did increase. Consequently, information treatment studies were
coded according to whether data on information gain were reported, how those
data were reported, and the conclusions that could be drawn in terms of the
amount of information gain (Categories G.1.a., b., c.). In addition, an
effect size was computed for the information assessment, with a median effect
size entered if there was more than one measure of knowledge. This
information was entered in Category G.1.d.

In addition, it became clear during the coding that we were dealing with
three major types of studies that should be distinguished: course
evaluations, program evaluations (e.g., evaluations of graduate programs in
rehabilitation therapy), and experimental treatments. Also, while most
classroom mainstreaming studies compared mainstreamed versus nonmainstreamed
students, it also was anticipated that some might include the analysis of
data to determine the effects on students who had been exposed to
mainstreaming for varying amounts of time. These types of studies were coded
in Category G.2.

H.    Coding  Summary. 
The  number  of  minutes  spent  coding  each  study  was  recorded,  as  well  as 
who  did  the  coding. 

I.     Prior  Contact  Coding  Sheet. 
As noted above, it was decided that it would be important to have more
information about assessments of amounts and types of Ss' prior contact with
disabled persons than was obtained with Category B.8. That information was
obtained using the Prior Contact Coding Sheet. Included was a code (Category
[illegible]) to indicate whether prior contact was implicit (e.g., psychiatric
nurses who had been working in mental hospital wards prior to an inservice
course) even though the researcher did not deal explicitly with prior contact
as a moderating variable. Also coded were how prior contact was assessed,
including the type of prior contact setting and the definition of degree of
contact, if available; the uses made of any prior contact information; and
the type of disability with which prior contact had occurred. In addition,
the minutes spent in coding this additional information were entered.

J.  Contact  Coding  Sheet. 
In the discussion of prior reviews of the literature on modifying
attitudes toward disabled persons in Chapter 2 (see Table 4 in particular),
we indicated that there is consensus that contact per se is not likely to
produce more positive attitudes toward persons with disabilities and may even
reinforce negative attitudes. The elements in contact that are likely to
affect whether contact results in more positive attitudes were alluded to by
reviewers such as Donaldson (1980), Harth (1973), Segal (1978), and Westwood
et al. (1981). The summary of those factors by Yuker et al. (1970) is
frequently cited in the literature, and Yuker (1986) has provided a more
recent synopsis. The theoretical bases for the enumeration of factors
related to the effects of contact on attitudes toward disabled persons are
typically found in Allport (1954) and Amir (1969). In a recent paper not
available to us until the final report was in preparation, Makas (1986) has
discussed those factors again. (She presents a well-developed argument that
the inconsistent results of the disabilities-contact attitude-modification
research are not due to inadequacies in the theory but to methodological
inadequacies in the research studies.)

A contact coding sheet was developed based on the prior discussions of
the theoretically important dimensions of contact in modifying attitudes.
Categories were included to gather information on the relative status of the
nondisabled Ss and the disabled persons with whom they had contact, including
age, educational-vocational discrepancies, and helping relationships
(Categories J.5.a.-d.). An overall judgment of status was coded, based on the
prior subcategories and any indicators of social status available in the
report.

Aspects of the type of contact were also coded. Included were the extent
to which contact was voluntary, the extent of intimacy involved, the extent
to which cooperation and/or competition was involved in interactions, the
extent to which the interaction itself produced reinforcement and/or the
extent to which the nondisabled Ss were reinforced for interacting with
disabled persons, the extent to which the contact situation was pleasant, and
the extent to which there was modeling either by peers or significant others
of positive interactions with disabled persons (Category 7.a.-f.). Also
coded was the existence or nonexistence of institutional, authority, or peer
support for positive interactions and attitudes (Category 8.).

The characteristics of the disabled persons may have an impact on the
outcomes of contact. Coded as characteristics of persons with disabilities
were the type of disability which they had, whether they purveyed negative
stereotypes, and the extent to which they were likely to be viewed as
competent by the nondisabled Ss (Categories 9.a.-e.).

The characteristics of the nondisabled Ss are also theorized to be
important factors in the outcomes of contact. In Category 10.a. through
10.d., we coded whether personality attributes of nondisabled Ss or their
prior attitudes toward disabled persons were assessed and, if they were and
the results analyzed, what the relationships were to post-treatment attitudes
toward persons with disabilities. (Prior contact with disabled persons by
the nondisabled Ss, another potentially potent predictor of the effects of
attitude modification treatments, had already been coded with the prior
contact coding sheet.) Again, the minutes spent in coding these categories
were also entered on the coding sheet.

ES  Information  Missing  Coding  Instrument 

As noted earlier, in most meta-analyses studies are rejected if
information is not available for computing effect sizes; we decided, however
(and later found that Slavin [1986] concurred), that important information
would be lost by simply discarding those studies. Yet, without effect sizes
to be used in our analyses, it did not seem economical to code the "missing
effect size" studies as completely as those for which effect sizes were
available. Consequently, an ES Information Missing Coding Instrument was
developed by striking from the full coding instrument categories which seemed
likely to be of little value for analysis without effect size data. (The ES
Information Missing Coding Instrument is included in Appendix B.) Studies
without effect size information were coded only if information about the
statistical significance of results was available.

Coding  Time 

The time to be spent in coding is of concern to those planning
quantitative, integrative reviews. In planning this integrative review, we
estimated an average coding time of 3 hours per report, based on the
meta-analysis coding experience of staff at USU's Early Intervention Research
Institute. That figure turned out to be a slight underestimate.

The front page of each coding instrument has spaces for entering the
times at which scoring began and ended. The next to the last category is
"minutes spent coding". Reliability checks were not kept on that category
and it must be recognized that times entered, although seemingly accurate to
the minute, were often rough estimates. A major contributor to unreliability
was the interruptions that inevitably occurred as raters scored: another
rater needing counsel, or telephone calls and drop-ins from students,
faculty, or other personnel as concurrent projects and university business
had to be conducted.

In any event, as can be seen in Table 6, the mean scoring time of 175
minutes (2.9 hours) for the main coding instrument is amazingly close to the
3-hour estimate. However, with the mean time for the Prior Contact Coding
Sheet (12 minutes/.2 hours) and the Contact Coding Sheet (21 minutes/.35
hours) added on, the total time is 208 minutes (3 hours and 28 minutes, or
3.47 hours), and 3 hours is a 13 percent underestimate. It is also important
to note that the coding time figures presented in Table 6 are probably
underestimates of actual time spent in coding the reports. They tend to
reflect the time actually spent in coding and are less likely, for example,
to include time getting ready to code: selecting a report, perusing it to be
sure it is relevant and that information necessary for coding is available
(which often involved consulting with other staff members), and discussing
general coding issues that arose as raters grappled with applying the coding
categories to individual reports.
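The arithmetic behind the 13 percent figure can be checked with a few lines
of Python. The variable names are illustrative; the minutes are the means
reported in Table 6, and the percentage is taken relative to the observed
total, which is the reading that reproduces the figure in the text.

```python
# Mean coding times in minutes, as reported in Table 6.
main_instrument_minutes = 175
prior_contact_minutes = 12
contact_sheet_minutes = 21
planned_minutes = 3 * 60  # the 3-hour planning estimate

total_minutes = (main_instrument_minutes + prior_contact_minutes
                 + contact_sheet_minutes)
total_hours = total_minutes / 60

# Shortfall of the planning estimate relative to the observed total.
shortfall_percent = 100 * (total_minutes - planned_minutes) / total_minutes

print(total_minutes, round(total_hours, 2), round(shortfall_percent))
```

Note that the total applies only to reports coded with all three
instruments; the Contact Coding Sheet was completed for the contact studies
only.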

As would be expected, the average time of scoring did differ among types
of report. Table 6 presents that information for the main coding instrument.
Combinations understandably took the longest time to code on the average (200
minutes/3.3 hours), with the average time for dissertations very close (195
minutes/3.2 hours), and convention papers taking the least time on the
average (89 minutes/1.5 hours). The larger standard deviation for
combinations reflects the variety of associations of reports. While a
combination was typically a dissertation and a journal article, it might also
have been a journal article and a convention paper, or two articles.

Table 6

Time to Code Reports for 197 T X C, T X P, Pre-post Studies

Main Coding Instrument

                                      Mean            Standard Deviation
Report Type             N             (min./hr.)(a)   (min./hr.)(a)
Journal                 [illegible]   [illegible]     [illegible]
Dissertation            [illegible]   195/3.2         [illegible]
Thesis                  7             152/2.5         48/.8
Convention Paper        5             89/1.5          35/.6
Unpublished Report      10            176/2.9         96/1.6
Combination(b)          18            200/3.3         128/2.1
Total                   197           175/2.9         96/1.6

Prior Contact Coding Sheet

                        N             Mean            Standard Deviation
                        197           12/.2           10/.2

Contact Coding Sheet

                        N             Mean            Standard Deviation
                        41(c)         21/.35          13/.21

(a) Data presented are minutes/hours.

(b) Two or more reports of the same study, usually a dissertation-journal
article combination. See Chapter 4 for more detail.

(c) This coding sheet was used only on those studies in which contact had been
coded as an attitude modification technique in a treatment by control,
treatment by placebo, or single-group, pre-posttest design.

Rater  Reliability 

In conducting a quantitative review of research, as with any data
collection endeavor, there must be concern with the reliability of the data
collected. However, as Glass and his associates (1981, p. 75) have noted,
the situation in estimating the reliability of scores is somewhat different
in meta-analysis than that in usual assessment settings where score
inconsistencies can reflect both lack of stability in the phenomena being
observed (or assessed) and inconsistency in judgments by observers (or
assessors or raters). Because the data for meta-analyses come from written
reports, the first source of inconsistency is eliminated and the principal
source of measurement unreliability is rater inconsistency. Consequently,
while one must be cautious in primary studies to distinguish between observer
consistency and score reliability, with the former an ingredient of the
latter, in meta-analyses the two are the same.

The estimation of the reliability of scores is crucial to the adequate
interpretation of results. Clearly, as Jackson (1980) has also pointed out,
in gathering data for a review of literature, two types of rater consistency
are of concern: consistency among the raters coding the same studies
(inter-rater agreement) and consistency between the same rater coding a
particular study at different points in time (intra-rater agreement or
absence of drift). Both types of reliability were addressed in this study.


Most of the data collection for this integrative review was conducted by
four raters: the principal investigator, a collaborator from another
university, and two doctoral-level graduate students. Because of the
difficulties of communication and because he took on other project tasks, the
collaborator scored the fewest studies (about 14% of the effect sizes). The
principal investigator coded approximately one-fourth of the effect sizes,
while one graduate student coded approximately 28 percent and the other 33
percent of the effect sizes. As will be noted later, the two graduate
assistants did the coding with the Prior Contact Coding Sheet (one coding
about 45% and the other about 55% of the effect sizes), and the principal
investigator and one of the graduate students did the coding with the Contact
Coding Sheet.

As  Orwin  and  Cordray  (1985)  have  pointed  out,  little  attention  has  been 
given  to  how  to  assess  overall  agreement  for  more  than  two  raters.  As  did 
Bullock  and  Svyantek  (1985)  and  Stock  et  al.  (1982),  we  decided  that 
percentage  of  agreement  was  the  most  accurate  statistic  to  use  for  assessing 
inter-rater  and  intra-rater  reliability.  We  recognized,  however,  that  this 
method  of  estimating  rater  reliability  was  not  without  its  problems. 

The major difficulty in the use of percentage of agreement, from our
point of view, is that an overall estimate of high agreement might be
obtained despite substantial disagreement on some individual items (Orwin &
Cordray, 1985), obscuring the need for corrective measures to enhance
agreement (and validity). We countered that possibility by making the
checking of both inter-rater and intra-rater agreement a group process. That
is, with the exception of the times when the research collaborator was not
present because he was coding on another campus, all raters met for
reliability sessions. Each rater in turn indicated how he or she had coded
each category, and the project director filled in a reliability check sheet.
For intra-rater reliability checks, at least one other rater (usually the
project director) read from one completed coding instrument and completed the
reliability check sheet while the rater for whom the check was being
completed read codings from the other completed coding instrument. This
procedure provided the opportunity not only to ascertain any serious
disagreements on category definitions, but also to discuss even mild
disagreements to enhance consistency.

A rigorous criterion for reliability (90% agreement) was set, even
though a criterion of 80% agreement is commonly used. The 90% criterion was
particularly stringent for inter-rater reliability because any categorization
on which two or more of the raters disagreed was coded as a disagreement.
Adopting Glass's convention on "near misses" (Glass et al., 1981), what we
defined as "one space" disagreements were coded as agreements.

Inter-rater reliability. It was originally proposed that every tenth
report, selected randomly, would be scored independently by two of the four
coders on a rotating basis. Once the project was underway, however, it was
decided that all of the raters would participate in each inter-rater
reliability check in order to capitalize on the discussions of coding
disagreements as a means of enhancing rater consistency, as noted in the
prior section.

An inter-rater reliability check was conducted when any one of the
raters had completed approximately 10 reports. The number of studies coded
by different raters during a given period of time varied, depending upon
factors such as report length, how difficult it was to extract information
from a report, and the project duties that took varying amounts of the
raters' time. As a consequence, at any one inter-rater reliability check, one
or more raters may have coded slightly more than 10 studies and one or more
may have coded fewer than 10 studies.

Once the coding instrument was judged to be in its nearly final form
(early in December of 1985), coding checks were begun and from that time
until the end of January 1986, 12 separate reliability checks were completed.
The results were as follows: 88%, 82%, 87%, 91%, 81%, 86%, 91%, 66%, 86%,
89.5%, 94%, and 88% agreement, for the three raters who were on campus at
Utah State University. When the combined scores of those three coders were
compared with those of the coder who was off-campus for the last three
reliability checks, the result was 95%, 93%, and 92% agreement. At that
point, it was decided to go ahead with data collection.

On February 14, 1986, an inter-rater reliability check for all four
coders yielded 89% agreement, which was deemed close enough to the 90%
criterion to proceed. (When coded by the less stringent procedure of
counting the number of agreements and disagreements per category, instead of
scoring each category on a dichotomy of either all four coders agreed or
disagreed, there was 97% agreement.)
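The difference between the two scoring procedures can be illustrated with a
small sketch. Several details are assumptions made for the example rather
than features documented in the report: codes are treated as integers on an
ordinal scale, a "one space" near miss is taken to be a difference of at most
1, and the less stringent procedure is interpreted as counting agreements
over all pairs of raters within each category. All names are hypothetical.

```python
from itertools import combinations

def codes_agree(code_a, code_b, near_miss=1):
    """Treat two codes as agreeing if they differ by at most near_miss."""
    return abs(code_a - code_b) <= near_miss

def strict_agreement(ratings, near_miss=1):
    """Percent of categories on which every pair of raters agrees."""
    agreed = sum(
        all(codes_agree(a, b, near_miss) for a, b in combinations(codes, 2))
        for codes in ratings
    )
    return 100.0 * agreed / len(ratings)

def pairwise_agreement(ratings, near_miss=1):
    """Percent of rater pairs, summed over categories, that agree."""
    hits = 0
    total = 0
    for codes in ratings:
        for a, b in combinations(codes, 2):
            total += 1
            hits += codes_agree(a, b, near_miss)
    return 100.0 * hits / total

# Hypothetical codes from four raters on four categories.
ratings = [(1, 1, 1, 1), (2, 2, 3, 2), (1, 1, 1, 4), (3, 3, 3, 3)]
print(strict_agreement(ratings), pairwise_agreement(ratings))
```

For nominal categories, where a one-space difference has no meaning,
near_miss would be set to 0. A single dissenting rater counts as a full
disagreement under the strict rule but affects only some of the rater pairs
under the pairwise count, which is why the second figure is the higher one.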

The next inter-rater agreement check was on March 6, 1986. The three
on-campus coders had a 90% agreement; when the combined scores of those three
coders were compared with those of the off-campus coder, there was 98.5%
agreement.

On March 2[illegible], a third check was done. The three on-campus raters
had 96% agreement (with 91% perfect agreements). There was 90% agreement
between the off-campus rater's codings and the combined codings for the three
on-campus raters.

On April 15, an inter-rater check for the three on-campus coders yielded
96% agreement; on May 5, 93% agreement; and on May 19, 85% agreement. By
April 15, the off-campus rater was no longer coding reports, so was not
included in the checks. Because criterion was not met on May 19 (85%
agreement), a second study was coded, yielding 90% agreement.

The last inter-rater reliability check on the main coding instrument was
conducted on June 4, 1986, with 94% agreement.

Prior  contact  rater  agreement 

The inter-rater agreement rate was also checked for the Prior Contact
Coding Sheet. Because the sheet was rather simple and straightforward to
use, no problems were anticipated and none were encountered. A check among
the three on-campus raters yielded a 100% agreement before coding began.
Coding was then conducted by the two graduate assistants. Again, an
inter-rater reliability check was conducted for every 10 reports. All but one
yielded 100% agreement; for the one, agreement was 90%.

Contact  rater  agreement 

The distinctions to be made in coding the Contact Coding Sheet were more
subtle, and it was more difficult for the principal investigator and graduate
assistant to attain adequate inter-rater agreement. Eleven reliability
checks were conducted over a two-week period in January 1987 before
independent scoring commenced. The initial reliability sessions also served
as formative evaluations of the coding conventions, although the categories
in the coding instrument had stabilized by then. During and after each
reliability check session, conventions were revised prior to further coding.
The percentages of agreements attained were: 72%, 90%, 72%, 90%, 90%, 86%,
83%, 78%, 89%, 94%, and 94%. An inter-rater reliability check was again
conducted for approximately every 10 reports. Given the relatively small
number of contact studies (N = 41), only two checks were made. Agreement was
88% for the first one and 94% for the second.

Effect size accuracy

As an additional caution, because effect sizes are such a central part
of a quantitative review, every effect size was checked for accuracy. Not
only were computations redone, but the selection of statistics from the
report for computing each effect size was reviewed for corrections.
Thirty-one errors were detected (and corrected), for an overall mean accuracy
rate of 94%. Taken by rater, one was 99% accurate, two were 97% accurate, and
one was 83% accurate.

The lowest accuracy rate was for the off-campus coder, which was not
surprising considering the amount of communication which took place among the
on-campus coders. Because it was assumed that the purpose of coding was to
obtain the most valid possible representation of the studies, the on-campus
reviewers commonly discussed with one another difficult coding choices when
they were not coding for a reliability check (that coding was, of course,
done independently). In terms of accuracy of computing effect sizes in
particular, potential errors were occasionally picked up through this
interactive process.

Our experience with this study indicates that it is feasible to use
raters who cannot communicate in person with one another whenever they wish.
The success of the off-campus coder in using the coding instrument is also
tentative evidence for the replicability of our results (for the pitfalls of
data collection in replicating meta-analyses, see Bullock & Svyantek, 1985,
and Stock et al., 1982). Our experience also suggests, however, that
frequent communication among raters is a desirable feature of data collection
for enhancing inter-rater agreement and reducing even a small margin of
error.

Intra-rater reliability. As originally proposed, after coding
approximately 30 reports, each rater recoded one of the reports at the
beginning of the sequence, with the particular one selected by the project
director. Recoding was, of course, done without benefit of the first coding
sheet. Again, the criterion was 90% agreement.

Due to different rates of coding reports (as noted above), one rater had
three intra-reliability checks, one rater had two intra-reliability checks,
one rater had one intra-reliability check, and one rater coded fewer than 30
reports so had no checks. For the first rater, the percentages of
intra-coder agreement were 98%, 96%, and 93%; for the second rater, the
figures were 91% and 92%; and, for the third rater, there was 96.5% agreement.

For the Prior Contact Coding Sheets and the Contact Coding Sheets, there
were so few studies for which prior contact was assessed that no intra-rater
reliability checks were conducted.

Summary. Overall, the reliability of scoring was deemed to be adequate,
based both on the percentages of agreement and the general consensus that
seemed obvious as the coders discussed studies during the reliability checks
and in communicating with one another as problems arose during the coding of
nonreliability check reports.

Data Analysis

As Glass and his associates (1981, pp. 197-200) have pointed out, the
role that statistical inference should play in meta-analyses is anything but
clear. There are a number of reasons for not using inferential statistics in
an integrative review such as reported here. The first and perhaps most
obvious is that the data to be analyzed constitute an accessible population,
not a sample (although, of course, there is always the sticky conundrum,
pointed out by Jackson [1980, p. 453] and Cooper [1982, pp. 294-5], of
whether to consider the primary studies being reviewed or the populations and
settings in which the findings might be applied as the appropriate target in
drawing conclusions). Glass et al. (1981, pp. 199-200) discussed, without
great clarity, the advisability of dealing with sampling error even when the
studies reviewed are considered to be an accessible population, proposed
inferential techniques to be used, and recounted the experience of being
chided by Tukey for not presenting standard errors for mean effect sizes.
Nevertheless, the use of inferential statistics in analyzing data considered
to come from an accessible population would appear to be more a perpetuation
of ritual than a rationally justified procedure.

The statistical inference Zeitgeist has amazing tenacity. Along with
Glass et al. (1981), other books on quantitative review techniques have
sections on the use of inferential statistics (e.g., Cooper, 1984; Hedges &
Olkin, 1985; Hunter et al., 1982; Rosenthal, 1984; and Wolf, 1986), and
reports of meta-analyses frequently contain inferential statistics results.
The misplaced emphasis on inferential statistics is nowhere better
illustrated than by the dysfunctional recommendation that statistical power
be increased by pooling the data from primary studies (Hedges & Olkin, 1986).

The overreliance on and misinterpretation of inferential statistics in
primary research have been commented on (see, e.g., Carver, 1978; Shaver,
1979a, 1980, 1985a, b). The use of an indicator of the significance of
research results which is dependent upon sample size, as statistical
significance is, is no more appropriate in analyzing the findings from
primary research reports in an integrative review than it is in primary
research. As Wilson and Rachman (1983) have pointed out, there is with
meta-analyses, as in primary research, the "danger that [the use of]
sophisticated statistical techniques [will] serve [to] obscure damaging flaws
in the evidence" (p. 55).

In  this  study,  we  attended  to  the  prior  criticisms  of  the  misuse  of 
inferential  statistics  to  analyze  data  in  primary  studies  and  followed  the 
logic  of  the  irrelevance  of  inferential  statistics  with  population  data.  The 
basic  analytic  approach,  therefore,  was  descriptive.  Glass's  (1976) 
recommendation  was  accepted  that  variables  which  might  have  moderated 
treatment  effects  be  investigated,  an  approach  which  can  be  recast  as 
determining  to  what  extent  treatment  effects  appear  to  be  nested  within  other 
variables,  such  as  the  age-level  of  subjects  and  the  type  of  research  design. 
Considerable  effort  was  put  into  attempting  to  disentangle  treatment 
techniques from study characteristics in order to determine what conclusions
about  treatment  effects  might  be  legitimately  drawn  and,  conversely,  to 
determine  what  confounding  factors  might  be  accounting  for  the  inconsistent 
results  referred  to  in  prior  reviews. 

As with tests of statistical significance, we also eschewed other
inferential procedures such as tests of homogeneity of effect sizes (Hedges,
1982), the estimation of effect size from a series of independent experiments
(Rosenthal & Rubin, 1982), adjustments to increase the accuracy of effect
sizes (Hedges, 1981) (which in the experience of the Early Intervention
Research Institute staff at Utah State University tend to produce negligible
and practically unimportant adjustments [also see Bangert-Drowns, 1986]), and
weighting procedures for minimizing cumulated effect size variances
(Bangert-Drowns, 1986; Green & Hall, 1984).

Basic  descriptive  statistics  were  computed — means,  modes,  medians, 
standard  deviations,  and  ranges — and  frequency  distributions  are  sometimes 
reported, too, in the Results section. Two- and three-way frequency tables
were  used  to  illustrate  possible  interactions  between  variables,  especially 
treatment  techniques  and  study  characteristics.  The  approach  was  to 
demonstrate  how  the  characteristics  of  the  studies  in  our  accessible 
population,  particularly  the  treatment  techniques,  were  related  to  the  size 
of  effects. 

Index  of  triviality.  Even  with  an  accessible  population  of  studies  and 
without  the  use  of  inferential  statistics,  the  researcher  is  faced  with 
issues  of  inference.  For  example,  how  does  one  decide  how  large  a  difference 
between mean Ds (e.g., the mean Ds for two treatments or for two categories
of study quality) must be in order to be considered "important"? The
question is, of course, analogous to the question of how to determine if the
result of a piece of research, expressed as a difference between posttest
means, is of practical significance. And our solution was analogous as well,
although our initial question was somewhat different. It was, at what point
could we consider a difference in mean effect sizes to be so trivial that we
would  be  justified  in  not  paying  further  attention  to  it?  Such  an  index  of 
triviality  is  not  the  clear  converse  of  an  index  of  importance.  That  is,  a 
difference  below  the  index  would  be  judged  clearly  trivial,  but  one  greater 
than  the  index  would  not  necessarily  be  of  practical  importance. 

In addressing the question of practical importance and triviality, we
treated mean Ds as we would treat the means for data collected on Ss in a
primary piece of research: we computed an effect size by subtracting one
mean from the other and dividing by a standard deviation. In this case, the
means were population values, as was the standard deviation. Consequently,
the use of Cohen's (1977) symbol, d, to represent the effect size is
appropriate, because he defines that effect size as a population parameter
(pp. 9, 20).

Because we had only one standard deviation to deal with (the population
parameter, which was .61*), it was not necessary to compute a d for each
comparison. Once a minimum d was established for purposes of judging
triviality, it could be determined how large a difference in means had to be
in order to attain that value.

Setting such a convention is a dubious process. Cohen (1977, p. 12) has
cautioned that establishing conventions for judging when an effect size is of
acceptable, or unacceptable, magnitude is as arbitrary (and as potentially
useful, yet subject to misuse) as the use of the common .05 criterion for
statistical significance. Glass et al. (1981, p. 104) argued strongly
against the establishment of conventions by which to label regions of an
effect size metric with such adjectives as "small", "moderate", or "large".
As they correctly noted, an effect size can only be interpreted meaningfully
in context, that is, in terms of the benefits to be achieved given the cost of
producing the result (also see Shaver, 1985a, b).

It should be noted that the arguments of Cohen (1977) and Glass et al.
(1981) are directed against the setting of conventions that would be used
across primary research studies or integrative reviews, as has been the case
with the .05 level of statistical significance. In a particular piece of
research, such as our review of the research on modifying attitudes toward
persons with disabilities, one must address the matter of a minimum
interpretable effect size either directly or through implicit, unarticulated
standards. We chose the former, but the answer was not easy to come by, even
following Cohen's one guideline for such decisions, "that they not be
unreasonable" (p. 12).

*This is the standard deviation for the population of studies that
constituted the main analyses reported in Chapter 6. It is nearly identical
to the standard deviation of .62 for the 705 Ds in the total data set.

Setting the standard for a minimum effect size is particularly difficult
when it is not clear what benefits are indicated by different values of a
dependent measure. This is a particular problem in the attitudes toward
disabled persons research. Data are not available that allow the valid
formulation of clear expectations as to the effects of a difference in scores
on an assessment such as the Attitudes Toward Disabled Persons Scale (Yuker
et al., 1970), which was widely used in the primary studies. How different
must their scores be before Ss will differ in their thoughts the next time
they encounter a person with a disability or, more importantly, will behave
differently toward a person with a disability or toward other nondisabled
persons who make stereotypic statements about disabilities or the persons who
have them? Or, how different must such scores be to indicate, if they ever
will, that the Ss will take different stances on public policies that affect
educational, economic, and social equity for persons with disabilities? In
the absence of empirically-based answers to such questions, establishing a
minimum effect size for interpretability is, at best, a loosely bounded
guessing game, albeit a necessary one.

We proceeded by using Cohen's (1977) standard for a "small" effect size
(pp. 25-26), d = .2, as a starting point. With a population standard
deviation of .62, a mean difference of .12 would yield a d of .2. The
corresponding r_pb is .10, an obviously small correlation that suggests an
obviously small amount of common variance (i.e., r^2 = .01 or 1%). Overlap
of distributions is another corollary to d as an indicator of effect size
magnitude (e.g., Cohen, 1977; Glass et al., 1981). Cohen (p. 22) indicates,
for example, that with a d = .20, the overlap of distributions is 85.3
percent (14.7% nonoverlap). This interpretation depends, however, on the
assumption that the distributions are normal, an assumption not tenable with
some of our data.
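
The correspondences cited from Cohen can be approximated with a short sketch. It assumes two equal-sized groups for the conversion from d to r and equal-variance normal distributions for the nonoverlap figure (Cohen's U1); the function names are illustrative and are not taken from the report.

```python
import math


def d_to_r(d):
    """Point-biserial r implied by d, assuming two equal-sized groups."""
    return d / math.sqrt(d ** 2 + 4)


def normal_cdf(z):
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def nonoverlap_u1(d):
    """Cohen's U1: proportion of the combined area of two equal-variance
    normal distributions that is not shared."""
    u2 = normal_cdf(abs(d) / 2)
    return (2 * u2 - 1) / u2


r = d_to_r(0.2)
print(round(r, 2))       # correlation corresponding to d = .2
print(round(r ** 2, 2))  # proportion of common variance
print(nonoverlap_u1(0.2))  # close to Cohen's tabled 14.7% nonoverlap
```

The computed nonoverlap may differ from the tabled value in the third decimal place because of rounding in the published table.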

The  same  line  of  reasoning  was  applied  to  the  analysis  of  the  variance 
ratios  computed  when  variances  were  available  for  the  treatment  groups  (see 
Chapter  6).  For  the  453  variance  ratios  that  could  be  computed  for  the  644 
effect  sizes  used  in  the  main  analyses,  the  standard  deviation  was  .91  (with 
a  mean  of  1.13,  very  close  to  the  value  for  equal  variances).  To  yield  a  d 
of  .2,  a  difference  in  mean  variance  ratios  would  have  to  be  .18  (the  same  as 
obtained using the standard deviation of .88 for the 705 effect sizes), our
index  of  triviality  for  variance  ratios. 
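
The arithmetic behind the two indices of triviality can be sketched as follows: because d is a difference divided by a standard deviation, the smallest difference that reaches a chosen d is simply d times that standard deviation. The function and constant names are illustrative assumptions; the standard deviations are the values reported above for the main data set.

```python
def min_difference(d, sd):
    """Smallest difference between two means that attains effect size d."""
    return d * sd


SD_MEAN_DS = 0.61          # s.d. of the Ds in the main analyses
SD_VARIANCE_RATIOS = 0.91  # s.d. of the 453 variance ratios

print(round(min_difference(0.2, SD_MEAN_DS), 2))          # index for mean Ds
print(round(min_difference(0.2, SD_VARIANCE_RATIOS), 2))  # index for ratios
```

The same function can be called with d = .5, .8, or 1.00 to obtain thresholds for other benchmarks, although the last digit may depend on how intermediate values are rounded.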

It  did  not  seem  likely  that  anyone  would  argue  that  we  would  be 
overlooking  important  differences  with  a  difference  in  mean  Ds  of  .12  or  in 
mean  variance  ratios  of  .18  as  an  index  of  triviality.  Also,  those  indices 
were  consistent  with  our  earlier  intuitive  judgments,  in  inspecting  data, 
about mean differences that hardly seemed worth attending to.*

*Cohen (1977) specifies d = .5 as a medium effect size and d = .8 as a large
effect size. If applied to our data set, a difference in mean Ds would have
to be .30 to be "moderate" by that criterion and .49 to be "high"; a
difference in mean variance ratios would have to be .45 to be a "moderate"
effect size and .73 to be "large". If a commonly used criterion of
practical significance, d = 1.00, were used, the difference in mean Ds
would, of course, have to be .61 and the difference in mean variance ratios,
.91.

Minimum ns. Another question that had to be addressed was how many Ds
must a result be based on before it would be considered sufficiently stable
for interpretation. That is, just as replications are important to
establishing the stability of a finding in primary research, so is it
important in conducting a review of research not to over-interpret findings
that may not be stable. Even when dealing with a population of studies, the
confidence one has in a result must be based on, among other things, the
number of data points on which it is based. With few guidelines to go by, it
was decided that a mean D had to be based on at least 10 Ds to be considered
sufficiently stable for interpretation. Although our principal set of
primary research reports had a mean of 3.3 effect sizes per report, the
median and mode were both 2. Consequently, it can be anticipated that a mean
D based on 10 Ds will typically reflect data from approximately 5 separate
studies.

Our criterion for multiple effect sizes should not be confused with a
demand for replications, in the sense of either repeated observations on
randomly assigned units or planned repetitions of studies. To confuse the
presence in the literature of several studies on a topic with replication as
a planned research strategy, as, for example, Bangert-Drowns (1986, p. 398)
and Jackson (1980, p. 445) seem to have done, would be an error. We simply
sought a minimum sense of stability, which is significantly different from
addressing the congruence among studies implied by the replication
terminology.

Binomial effect size display. It was anticipated before the study began
that the binomial effect size display (BESD) (Rosenthal, 1984; Rosenthal &
Rubin, 1982) might be a helpful interpretive device. Using the BESD involves
displaying results in a 2-by-2 frequency table according to treatment
condition and success rate. The example used to demonstrate application of
the BESD has to do with the numbers of persons who are alive or dead
following exposure to a treatment or a control condition. In such a case,
"success" is obvious.
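
For readers unfamiliar with the display, a minimal sketch follows. It uses the usual BESD convention of setting the two success rates at 50% plus and minus half the correlation (expressed in percentage points); the function name and the returned layout are illustrative assumptions, and r = .10 is used only because it is the correlation discussed above for d = .2.

```python
def besd(r):
    """Return the 2-by-2 BESD percentages implied by a correlation r."""
    treatment_success = 50 + 100 * r / 2
    control_success = 50 - 100 * r / 2
    return {
        "treatment": {
            "success": treatment_success,
            "failure": 100 - treatment_success,
        },
        "control": {
            "success": control_success,
            "failure": 100 - control_success,
        },
    }


table = besd(0.10)
for condition, cells in table.items():
    print(f"{condition:>9}: success {cells['success']:.0f}%, "
          f"failure {cells['failure']:.0f}%")
```

The display presupposes that a meaningful "success" versus "failure" dichotomy exists for the dependent measure.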

As the earlier discussion of the meaning of attitude scores in arriving
at an index of triviality indicated, establishing "success" and "failure" is
not so simple a matter in attitude modification research. Preece (1983) has
noted that the use of a median split to effect the necessary dichotomy is
meaningless in many cases. Ours is one of them. With no feasible means of
arriving at a valid success rate, the BESD turned out not to be usable for
our interpretative purposes.

Data  Availability 

As  noted  earlier,  lists  of  the  studies  that  were  coded,  as  well  as  those 
discarded,  are  contained  in  Appendices  D  and  E.  Descriptive  data  for  the 
studies  coded  are  included  in  Appendix  F.  The  coding  instrument  and 
conventions  for  coding  are  in  Appendices  B  and  C.  In  addition,  the  complete 
data  set  is  on  magnetic  tape  and  can  be  obtained  at  cost  from  the  project 
director. 

Summary 

In  this  chapter,  the  procedures  followed  in  conducting  our  integrative 
review  have  been  spelled  out,  with  considerable  detail  in  regard  to  the 
obtaining  of  primary  research  reports  and  the  coding  instrument  used  for  data 
collection.  The  outlines  of  our  approach  to  data  analysis  have  been 
sketched.  The  details  will  be  filled  in  as  some  information  about  our 
population of studies is presented in the next chapter, followed by the
results  of  analyses  of  our  data. 

CHAPTER 4

SOME  ATTRIBUTES  OF  THE  POPULATION  OF  STUDIES 

The  major  concern  in  conducting  this  integrative  review  of  the  research 
on  modifying  attitudes  toward  persons  with  disabilities  was,  of  course,  to 
answer the question, what does an analysis of the population of research
studies indicate about the effectiveness of different techniques for
modifying attitudes toward persons with disabilities? To set the context for
characterizing  the  relevant  findings  from  the  primary  research  reports  that 
we  located,  we  present  in  this  chapter  some  general  information  about  the 
population  of  studies  that  we  coded  and  analyzed. 

Numbers of Studies and Reports

As noted in Chapter 3, 273 primary research studies were coded for this
integrative review. The number of reports coded was actually higher (N =
303) because any piece of research reported in multiple sources was coded as
one study. For example, Donaldson's (1974) dissertation research was also
reported in journal articles she authored (Donaldson, 1976) and coauthored
(Donaldson & Martinson, 1977). Donaldson's study was coded only once, based
on all three reports, and contributed only once to our list of 273 studies.

Multiple reports of studies were most common, as they should be, with
dissertations and convention papers. Dissertation or thesis research was
reported in 33 journal articles, as well as in two convention papers, one
agency report, and one book. Similarly, 6 studies reported in convention
papers and 3 reported in agency reports were also available in journal
articles. Publication of the same research in different journal articles
was, as is to be expected, infrequent. Three sets of articles in different
journals were based on the same data (Dye, 1978, 1980; Donaldson, 1976;
Donaldson & Martinson, 1977), including a follow-up study (Esposito & Peach,
1983; Esposito & Reed, 1986).

The  273  studies  yielded  725  effect  sizes  (see  Table  7 — 644+61+20),  as 
well  as  182  results  for  which  information  to  compute  an  effect  size  was  not 
available,  but  information  on  statistical  significance  was.  A  total,  then,  of 
907  effect  sizes  and  "no  information"  results  were  coded. 

Partitions  in  the  Data  Set 

Table  8  presents  further  information  by  type  of  comparison,  with  the 
Mainstream and the No Information comparisons omitted. That omission
reflects  early  analysis  decisions.  Different  categories  were  used  to  code 
the  treatment  in  studies  of  attitude  change  with  classroom  mainstreaming 
(although  studies  which  included  other  types  of  in-school  contact,  such  as 
having  a  self-contained  special  education  classroom  in  the  school,  were 
analyzed  using  the  main  coding  instrument),  and  it  was  decided  to  analyze 
those  effect  sizes  separately.  Also,  the  No  Information  reports  lacked  basic 
information,  but  were  analyzed  separately,  primarily  to  determine  if  studies 
lacking  the  information  to  compute  effect  sizes  differed  otherwise  from  the 
effect  size  reports. 

The other categories in Tables 7 and 8 represent early coding decisions.
The  single-group,  pre-posttest  design  is  of  dubious  quality.  Studies  using 
that  design  were  coded,  however,  because  there  seemed  to  be  a  large  number  of 
them  and  it  was  thought  important  not  to  lose  the  potential  of  garnering 
information  from  those  results. 

Table 7

Numbers of Attitude Modification
Effect Sizes and Studies

Type of                    Number of       Number of
Comparison                 Effect Sizes    Studies

T vs. C                        497
T vs. P                         49
Pre-post                        98
  Subtotal                     644            200
A vs. B                         61             20
Mainstream                      20             10
No information results         182             53

Total                          907          283(273)*

Note: T X C = treatment-control group comparison; T X P = treatment by
placebo group comparison; pre-post = single-group, pretest-posttest mean
comparison; A vs. B = comparison of two treatment groups; Mainstream =
comparison from a study of the effects of mainstreamed classrooms; No
information = statistical significance information available, but not
information for computing effect size.

*Because a study could yield more than one type of comparison, the total of
283 exceeds the number of studies actually coded, i.e., 273.
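
The totals in Table 7 can be checked with a few lines of arithmetic. This is only a consistency sketch; the dictionary keys are illustrative labels, and the counts are those given in the table and in the text above.

```python
# Effect size counts by type of comparison, as given in Table 7.
main_effect_sizes = {"T vs. C": 497, "T vs. P": 49, "Pre-post": 98}
other_effect_sizes = {"A vs. B": 61, "Mainstream": 20}
no_information_results = 182

# Study counts by row of Table 7.
study_counts = {"Main": 200, "A vs. B": 20, "Mainstream": 10, "No information": 53}

subtotal = sum(main_effect_sizes.values())
all_effect_sizes = subtotal + sum(other_effect_sizes.values())
all_results = all_effect_sizes + no_information_results
all_study_rows = sum(study_counts.values())

print(subtotal)          # effect sizes in the main data set
print(all_effect_sizes)  # all coded effect sizes
print(all_results)       # effect sizes plus "no information" results
print(all_study_rows)    # study rows, counting a study once per comparison type
```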

Table 8

Types of Comparisons: Effect Size Ns, Means, and Standard Deviations
for 705 Effect Sizes

                            Effect Sizes (Ds)

Comparison Type      N      Mean    Median    s.d.    Range

T vs. C             499     .36      .32      .57     -1.42 to 4.40
T vs. P              49     .29      .23      .70     -1.22 to 2.42
Pre-post             97     .50      .33      .72     -1.61 to 3.11
A vs. B              60     .15      .06      .71     -1.07 to 3.55

Total               705*    .36               .62     -1.61 to 4.40

*An effect size was discarded and the Comparison Type categorization
of two effect sizes was changed during later analyses upon which
Table 7 is based; consequently, the numbers in this and later
tables will not match perfectly with those in Table 7.

By the same token, it seemed important to distinguish between "control"
and "placebo" groups, terms misused by many researchers in discussing
research designs. A "control" group is one which receives no treatment,
while a "placebo" group (sometimes referred to as a type of control
treatment) receives attention and/or materials that are assumed to be inert
(i.e., are assumed to have no effect per se on the dependent variable), but
is in all other ways (e.g., time, amount of experimenter attention) comparable
to the experimental treatment (see, e.g., Borg & Gall, 1983, pp. 221-22,
355). It seemed likely that control and placebo groups might react
differently due to the differences between receiving no treatment and some,
even if supposedly inert, treatment. Consequently, we coded which of the
designs was involved in a treatment comparison.

Analysis  Base 

It was further decided that the comparison base for an attitude change
treatment would, when available, be the absence of treatment (i.e., a control
or placebo condition*) rather than another treatment. With data from a
number of treatment comparisons, it would be difficult to illustrate and keep
evident what had been compared across effect sizes; data based on comparing
each treatment against an absence of treatment to obtain an effect size are
more amenable to interpretation. Consequently, when two treatment groups
(i.e., Treatment A and B) were present in a study and each was compared with
a control or placebo group, effect sizes were computed and coding conducted
for the treatment versus control (T vs. C in Tables 7 and 8) or treatment
versus placebo (T vs. P) comparisons, and not for the Treatment A versus
Treatment B (A vs. B) comparison. Treatment A versus Treatment B (A vs. B)
effect sizes were, however, computed and coded when a T X C or T X P effect
size was not available. Our later difficulties in untangling the A vs. B Ds
which were coded confirmed the soundness of the decision not to code them
generally.

*The single-group, pre-post design is, of course, a weak form of this
comparison with the pretest serving as an indication of attitudes in a
no-treatment, control situation.

Also, as can be seen in Table 8, A vs. B comparisons yielded Ds that
were, on the average, different from those from T vs. C and T vs. P
comparisons. As would be expected when comparing differing treatments rather
than treatment against lack of treatment, A vs. B comparisons yielded the
smallest mean D, .15, barely above the standard we set for trivial results,
.12 (see Chapter 3). The lower median D of .06 is probably a better
indicator of central tendency, as two outliers (Ds of 2.71 and 3.55, with the
next lowest D, 1.26) distorted an otherwise near-symmetrical distribution.
Also, sizeable differences between the pre-post effect size mean (.50)
and the T vs. C (.36) and T vs. P (.29) effect size means (differences of .14
and .21, respectively) look trivial when the median is used as the measure of
central tendency.

Most of the reporting of our data analyses will be based on the 644 T
vs. C, T vs. P, and pre-post Ds. However, the description of the data set
that follows includes A vs. B comparisons as well so as to provide a more
complete picture of the available body of research on modifying attitudes
toward disabled persons.

Types of Reports

One  aspect  of  integrative  reviews  that  has  attracted  interest  is  the 
types  of  research  reports  that  were  coded.    As  is  evident  in  Table  9,  our 




Table 9

Ds by Type of Report
for 705 Effect Sizes

                                Effect Sizes (Ds)
Report Type             N       Mean     Median     S.D.
Journal Article        141       .59       .42       .69
Dissertation           430       .24       .22       .60
Thesis                  13       .31       .30       .29
Convention Paper        13       .37       .38       .40
Unpublished Report      30       .37       .36       .38
Combination             78       .54       .44       .61
Total                  705       .36       .36       .62
major source of effect sizes was dissertations (N = 430), with journal
articles second (N = 141). Because a journal article was included in most of
the "combination" sources (e.g., effect sizes available from a dissertation
and a journal article), the number of journal article-based Ds is close to
200.

Of greater interest than the frequencies of Ds by type of publication
are the mean Ds. They support the observation by Bryant and Wortman (1985),
as well as by Bangert-Drowns (1986) and Walberg (1986), based on Smith
(1980), that published articles have consistently higher effect sizes than do
dissertations and other unpublished sources. The mean D for journal articles
is .59 and for dissertations, .24; the difference of .35 yields a d of .56,
using the population standard deviation of .62 (see Analysis section of
Chapter 3). The mean for Ds from combined sources (which, again, are largely
those Ds that came from dissertations and other unpublished reports from
which articles acceptable for publication were drawn) is .54, almost
identical to the mean D for articles. Dissertations have the lowest mean
D, .24. That mean is within .12 (our standard for a trivial difference; see
Analysis section of Chapter 3) of the thesis mean D (.31), but is barely more
than .12 lower than the means for convention papers and unpublished reports
(mean Ds for both = .37). The picture does not change much if the median Ds,
which are less influenced by outlying Ds, are inspected. Interestingly, the
dispersions are similar for journal article, combination source, and
dissertation Ds (s.d. = .69, .61, and .60, respectively), with the Ds from
the other three types of publications having lower and fairly similar
standard deviations.
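
The arithmetic behind the journal-versus-dissertation comparison can be sketched as follows. This is only an illustration of the calculation described in the paragraph above, not the authors' own procedure; the function and constant names are hypothetical, and the numeric values are those quoted from Table 9.

```python
# Values quoted in the text (Table 9); all names here are illustrative.
MEAN_D_JOURNAL = 0.59
MEAN_D_DISSERTATION = 0.24
POPULATION_SD = 0.62        # S.D. of all 705 Ds
TRIVIAL_THRESHOLD = 0.12    # the report's standard for a trivial difference


def standardized_difference(mean_a: float, mean_b: float, sd: float) -> float:
    """Difference between two mean Ds, expressed in population-S.D. units."""
    return (mean_a - mean_b) / sd


def is_trivial(mean_a: float, mean_b: float,
               threshold: float = TRIVIAL_THRESHOLD) -> bool:
    """True when two mean Ds differ by no more than the trivial standard."""
    # A tiny tolerance guards against floating-point noise at the boundary.
    return abs(mean_a - mean_b) <= threshold + 1e-9


d = standardized_difference(MEAN_D_JOURNAL, MEAN_D_DISSERTATION, POPULATION_SD)
print(f"journal vs. dissertation d = {d:.2f}")
print("dissertation vs. thesis trivial?", is_trivial(0.24, 0.31))
print("dissertation vs. convention paper trivial?", is_trivial(0.24, 0.37))
```

Under these assumptions the dissertation and thesis means fall within the .12 standard, while the dissertation and convention-paper means (a difference of .13) fall just outside it, as the text states.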

The potential for a distorted picture of research findings in a review
based only on published reports is obvious from Table 9. Whether the higher
D for journal articles is a reflection of the type of designs in studies that
are submitted and accepted for publication is addressed in part by Table 10.
The higher mean Ds for published articles, as contrasted with dissertations,
are clearly evident across all types of comparisons. Journal mean Ds are
also higher than those for the other report types except for convention
papers and unpublished reports based on treatment by control (T vs. C)
designs.

It is noteworthy that treatment by control designs were more frequent
for dissertation effect sizes than would be expected based on marginal
totals, and less frequent than expected for journal article effect sizes. At
the same time, single-group, pre-post designs were more frequent than
expected for journal article effect sizes and less frequent than expected for
dissertations. (Many of the journal articles, we found, were reports of
"convenience" research in which students in a college course were pre- and
posttested.) Similarly, while the mean D for T vs. C results (.49) published
in journals was lower than the journal pre-post mean (.65; 1.05 for
combination sources), the reverse was true for dissertations (.31
versus .05). Particularly striking as well, given the mean overall D for A
vs. B studies of .15, is the mean D for the eight journal article effect
sizes: 1.19. That is in sharp contrast to the mean D of -.03 for
dissertation A vs. B effect sizes. The data in Table 10 suggest rather
strongly either a predilection to submit research for publication only when
the results are striking (i.e., a large, statistically significant
difference), or to accept only such reports for publication. There does
appear to be a bias against the importance of confirming the lack of an
effect.
Table 10

Ds by Type of Report and Type of Comparison
for 705 Effect Sizes

                                     Comparison Type
Report Type            T vs. C    T vs. P    Pre-post    A vs. B    Totals

Journal Article          .49        .68        .65        1.19        .59
                          78a        12         43b          8        141

Dissertation             .31        .11        .05        -.03        .24
                         332b        32         20a         46        430

Thesis                   .31        .31                               .31
                          11          2          0           0         13

Convention Paper         .48                   .31                    .37
                           5          0          8           0         13

Unpublished Report       .46                   .27                    .37
                          17          0         13           0         30

Combination              .46        .62       1.05         .14        .54
                          56          3         13           6         78

Total                    .36        .29        .50         .15        .36
                         499         49         97          60        705

Note. The top number is the mean D; the bottom number is the frequency
of Ds.

a At least 10 less than expected, based on marginal totals.
b At least 10 more than expected, based on marginal totals.
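
The "expected, based on marginal totals" comparison used in the table footnotes can be sketched as below. We assume the conventional definition of an expected cell frequency (row total times column total, divided by the grand total); the report does not spell out its formula, so this is an illustration rather than the authors' procedure. The frequencies are those printed in Table 10, and the function names are hypothetical.

```python
from typing import List

# Frequencies of Ds from Table 10.
# Columns: T vs. C, T vs. P, Pre-post, A vs. B.
REPORT_TYPES = ["Journal Article", "Dissertation", "Thesis",
                "Convention Paper", "Unpublished Report", "Combination"]
TABLE_10 = [
    [78, 12, 43, 8],
    [332, 32, 20, 46],
    [11, 2, 0, 0],
    [5, 0, 8, 0],
    [17, 0, 13, 0],
    [56, 3, 13, 6],
]


def expected_counts(observed: List[List[int]]) -> List[List[float]]:
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def flag_cells(observed: List[List[int]], margin: float = 10.0) -> List[List[str]]:
    """Label each cell 'more' or 'less' when it is at least `margin` from expected."""
    expected = expected_counts(observed)
    return [
        ["more" if o - e >= margin else "less" if e - o >= margin else ""
         for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(observed, expected)
    ]


flags = flag_cells(TABLE_10)
for name, row in zip(REPORT_TYPES, flags):
    print(f"{name:20s}", row)
```

With this definition, only the T vs. C and pre-post cells for journal articles and dissertations are flagged, which agrees with the pattern described in the text.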
Research Over Thirty Years

The amount of research in an applied field, such as the modification of
attitudes toward disabled persons, is likely a function of a number of
factors, including public interest, especially as reflected in federal
funding for the particular research area. The growth in research in this
field has been great since the 1955-60 period. As indicated by the total
numbers of effect sizes we coded (see Total Effect Sizes column in Table 11),
there was a 2700 percent growth from the 1955-60 period to the 1971-75
period, from 3 effect sizes in 1955-60 to 91 effect sizes in 1971-75. (With a
mean of 3, and a median [and mode] of 2 effect sizes per study, that is a
growth from approximately 1 or 2 reported studies to approximately 46.) That
percentage of change is large because the base is so small. What catches the
eye is the over 200 percent growth from 1971-75 to the next time period,
1976-80: from 91 effect sizes, or about 46 reported studies, to 286 effect
sizes, or about 143 studies.

Keeping in mind the time lags between various steps in the production of
applied research (expression of public concern, legislative appropriations,
program announcements, funding of projects, proposal submission and approval,
completion of research, and report preparation and publication), an upsurge in
concern for persons with disabilities seems evident in Table 11. (Note that
the overall picture of growth in research portrayed by the effect size data
in Table 11 is consistent with the report data in Figure 1, Chapter 2.) Why
the drop in number of effect sizes from 1976-80 to 1981-86? It may be an
artifact of the difficulty in locating studies which are not yet well-cited
in the reference lists of reviews and other research reports, or it may
reflect a decline in interest in this area of research.
Table 11

Frequencies of Types of Comparisons by Year of Publication
for 705 Effect Sizes

                       Type of Comparison                  Total
Year        T vs. C    T vs. P    Pre-Post    A vs. B    Effect Sizes
1955-60        2          0          1           0             3
1961-65       13          0          8           2            23
1966-70       44          0         14           5            63
1971-75       62          4         17           8            91
1976-80      228b        11         26a         21           286
1981-86      150a        34b        27          24           235
Total        499         49         93          60           701c

Note. Data are effect size (D) frequencies. Cramer's V = .16.

a At least 10 less than expected, based on marginal totals.
b At least 10 more than expected, based on marginal totals.
c Rowe & Smith, in press, had 4 effect sizes that are not included in
this table.
Types of Comparisons

Given general concerns about research design, it is of interest whether
the designs used in disabilities attitude modification research have changed
over the years. Table 11 indicates little relationship between year of
publication and type of comparison for the effect sizes in our set. A low
Cramer's V (.16) reflects a relative increase in treatment-control (T vs. C)
comparisons reported in 1976-80, an increase in treatment-placebo (T vs. P)
comparisons reported in 1981-86, relatively fewer pre-post comparisons
reported in 1976-80, and fewer treatment-control comparisons in 1981-86, than
expected given marginal totals.
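
For readers unfamiliar with Cramer's V, the following sketch shows the standard computation from a frequency table: V is the square root of chi-square divided by N times the smaller of (rows - 1) and (columns - 1). The function name is hypothetical, and the cell frequencies are our reading of Table 11 from a poor scan; the 1966-70 pre-post and A vs. B cells and the 1981-86 pre-post cell are inferred from the printed row and column totals, so treat them as assumptions.

```python
import math
from typing import List

# Table 11 frequencies as we read them (three cells inferred from marginals).
# Columns: T vs. C, T vs. P, Pre-post, A vs. B; rows: 1955-60 ... 1981-86.
TABLE_11 = [
    [2, 0, 1, 0],
    [13, 0, 8, 2],
    [44, 0, 14, 5],      # 14 and 5 inferred from marginal totals
    [62, 4, 17, 8],
    [228, 11, 26, 21],
    [150, 34, 27, 24],   # 27 inferred from marginal totals
]


def cramers_v(observed: List[List[int]]) -> float:
    """Cramer's V = sqrt(chi-square / (N * min(rows - 1, cols - 1)))."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (obs - expected) ** 2 / expected
    k = min(len(observed), len(observed[0])) - 1
    return math.sqrt(chi_square / (n * k))


v = cramers_v(TABLE_11)
print(f"Cramer's V = {v:.2f}")
```

With the frequencies as read here, the result should come out close to the .16 reported in the table note. V ranges from 0 (no association) to 1, so a value of this size indicates only a weak relationship between year and type of comparison.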

Attitude Modification Techniques

Have researchers concerned with attitudes toward disabled persons
directed their attention to different attitude modification techniques over
the years? The type of modification techniques used in the research which
yielded our Ds changed somewhat over the 31-year span for which we located
reports (see Table 12). Perhaps of most interest, compared to the
frequencies expected based on marginal totals, are the dramatic increase in
the number of information technique effect sizes for the 1976-80 period, the
decrease in number of contact effect sizes reported in 1981-86, the spate of
vicarious experience effect sizes from the 1976-80 reports, followed by a
decline in 1981-86 which was accompanied by a jump in vicarious experience
plus information effect sizes in 1981-86. The rather dramatic increase in
effect sizes for which a combination of techniques was used (the Other
category) in 1976-86 is also of interest, perhaps indicating efforts to go
beyond conventional modes of thought in regard to modifying attitudes toward
persons with disabilities.
Table 12

Frequencies of Types of Modification Techniques by Year of Publication
(T x C, T x P, Pre-post, A vs. B effect sizes included)

                                        Type of Technique

                                                        Information
                               Information  Vicarious   + Vicarious  Positive       Persuasive  Systematic              Total
Year     Information  Contact  + Contact    Experience  Experience   Reinforcement  Message     Desensitization  Other  Effect Sizes
1955-60  1            2        0            0           0            0              0           0                0      3
1961-65  11           6        4            0           0            0              0           0                2      23
1966-70  7a           21       20           9           0            0              6           0                0      63
1971-75  28           11       27b          1           0            3              3           4                14     91
1976-80  107b         34       36           35b         13a          0              22          11               28a    286
1981-86  77           21a      20a          13          52b          0              3           0                49b    235
Total    231          95       107          58          65           3              34          15               93     701c

Note. Data are effect size (D) frequencies. Cramer's V = .26.

a At least 10 less than expected, based on marginal totals.
b At least 10 more than expected, based on marginal totals.
c Rowe & Smith, in press, had 4 effect sizes that are not included in
this table.
Types of Disabilities

Another question of interest in regard to our data set is, what
disabilities have been the targets of attitude modification research, and
have there been changes over the years? As can be seen in Table 13, the most
frequent target of the attitude change studies from which our Ds came was the
general category of persons with disabilities (N = 330). Attitudes toward
persons with physical disabilities were a distant second (N = 97), with
attitudes toward mentally retarded persons third (N = 81), and attitudes
toward the mentally ill fourth (N = 67).

There have been some changes over the years in the disabilities that
have been the targets of attitude modification studies. Table 13 contains
information about those changes. Particularly striking is the increase
during 1976-86 in effect sizes that came from studies in which disabilities
in general were the attitude target of concern. Given the evidence that
attitudes do differ according to disabilities (although according to Yuker
[1983], the differences may not be as uniformly stable as some [e.g.,
Richardson & Ronald, 1977]* have claimed), a broad focus on changing
attitudes toward disabilities in general may not be a wise strategy. In
contrast is the increased number of effect sizes during 1976-80 that came
from studies of attitudes toward specific mental retardation levels, rather
than toward mental retardation generally. However, the general drop-off in
1981-86 in effect sizes from research on attitudes toward mentally retarded
persons, persons with physical disabilities, and those who are mentally ill
may be perplexing to some professionals and advocates. The absence of


*Also, Abroms and Kodera (1979), Antonak (1980), Richardson, Goodman,
Hastorf, and Dornbusch (1961), and Tringo (1970).
Table 13

Frequencies of Disabilities Toward Which Treatment Directed, by Year
(T x C, T x P, Pre-post, A vs. B effect sizes included)

                               Mentally Retarded
                   Physically                             Mentally  Emotionally  Visually  Hearing   Learning  Para-/        Other                          Total
Year     General a Disabled    General  Moderate  Severe  Ill       Disturbed    Impaired  Impaired  Disabled  Quadriplegic  Physical  Other  Combination  Effect Sizes
1955-60  1         0           0        0         0       0         0            0         0         0         0             0         2      0            3
1961-65  2         0           5        0         1       12        0            3         0         0         0             0         0      0            23
1966-70  9b        10          7        0         0       18c       5            1         0         0         0             7         0      6            63
1971-75  33b       18          12       0         1       19c       0            6         0         0         0             0         2      0            91
1976-80  141       46          6b       20c       13      18        4            6         9         2         2             2         0      17           286
1981-86  144c      23b         8        0         8       0b        4            4         15        4         0             2         10     13           235
Total    330       97          38       20        23      67        13           20        24        6         2             11        14     36           701d

Note. See conventions for Category C.5. in Appendix C for definitions of
disability categories. Data are effect size (D) frequencies.
Cramer's V = .31.

a No disability target was specified.
b At least 10 less than expected, based on marginal totals.
c At least 10 more than expected, based on marginal totals.
d Rowe & Smith, in press, had 4 effect sizes that are not included in
this table.
research on attitudes toward persons with some disabilities, such as speech-
language impairments, may also be a matter of concern. On the other hand,
even the modest increase in effect sizes from research on modifying attitudes
toward the hearing impaired during 1976-86 may be encouraging to others.

Educational Level

Also of interest in terms of the thrust of research in modifying
attitudes toward persons with disabilities is the educational level of the
subjects used in the research. Table 14 shows that nearly the same number of
effect sizes came from college and university samples (N = 313; 44%) as from
elementary and secondary schools combined (N = 282; 40%), with the majority
of the latter group coming from the elementary grades. Little research has
been done with adult groups, with a total of only 103 effect sizes (14%)
coming from studies in which samples of postprofessional persons or adults
not in school were used.

Assessment

What methods have been used to assess attitudes in the research on
modifying attitudes toward persons with disabilities? Given the concerns
commonly expressed about the validity of questionnaires for predicting the
behavioral aspects of attitudes, it is disconcerting to find that 66% (N =
460) of the effect sizes we obtained were based on questionnaire data (see
Table 15), with items usually of the Likert-scale type. (It will not be
surprising to those familiar with the research literature which we coded to
know that for 44% [N = 201] of the questionnaire-based Ds, the Attitudes
Toward Disabled Persons [ATDP] scale [Yuker et al., 1970; Yuker & Block,
Table 14

Educational Level of the Ss for 704 Effect Sizes

                             Effect Sizes
Context                       N        %
Can't tell                    5        1
School
  Preschool                   3      0.4
  Primary                    30        4
  Intermediate              116       16
  Middle                     18        3
  Junior High                17        2
  Senior High                19        3
  Combination                74       10
  Total                     282       40
College
  Undergraduate             283       40
  Graduate                   30        4
  Total                     313       44
Postprofessional             53        7
Adults not in school         50        7
Other                         6        1
Total                       704      101

Note. Proportions in this table and the ones that follow were rounded to
two decimal places before being converted to percentages, except where
less than .005, to avoid rounding to zero where there was a frequency.
Also, as a result of rounding, percentages will sometimes add up to
slightly more or less than 100.
Table 15

Frequencies of Types of Assessment by Year
(T x C, T x P, Pre-post, A vs. B effect sizes included)

                                                  Assessment Type

         Interview                                             Social    Systematic   Semantic                 Projective  Sentence    Adjective                    Total
Year     Structured  Unstructured  Questionnaire  Sociometric  Distance  Observation  Differential  Telephone  Technique   Completion  Checklist  Rankings  Other  Effect Sizes
1955-60  0           0             0              0            0         0            0             0          2           1           0          0         0      3
1961-65  0           0             17             0            0         0            3             0          0           0           2          0         1      23
1966-70  0           0             43             0            2         0            11            0          0           0           2          0         5      63
1971-75  0           0             79b            0            3         0            8             0          0           0           0          0         1      91
1976-80  1           0             181            4            31        2            47b           2          0           1           4          3         10     286
1981-86  0           2             140a           6            27        1            10a           2          0           0           24         1         22     235
Total    1           2             460            10           63        3            79            4          2           2           32         4         39     701c

Note. Data are effect size (D) frequencies. Cramer's V = .44.

a At least 10 less than expected, based on marginal totals.
b At least 10 more than expected, based on marginal totals.
c Rowe & Smith, in press, had 4 effect sizes that are not included in
this table.
1986] was the assessment tool.) Only 3 Ds were based on systematic
observation data, and only 2 of the 4 Ds that came from telephone or mail
surveys were aimed at obtaining a response to a situation removed from the
research project (such as a poll of opinions toward use of university money
for a Center for Disabled Students) so that Ss would not be likely to see a
connection to the research and thus give biased responses (not in Table 15).
It should be noted that the 39 "Other" effect sizes included those obtained
with assessments of behavioral intentions (e.g., responses to questions about
intent or willingness to invite a disabled person home or to volunteer to
work with disabled persons). Clearly, assessment of attitudes was dominated
by questionnaires, especially if all paper-and-pencil assessments, such as
social distance and semantic differential results, are included with the
typically Likert scale questionnaire results. Only 12 Ds (2%) came from
nonpaper-and-pencil assessments (interviews, observation, telephone surveys).

Some changes in assessment are of interest. Two of these are the
relative dropoff in questionnaire use in 1981-86, accompanied by an increase
in use of adjective checklists. Also, the cause of the brief flare of
semantic differential use in reports that became available in 1976-80, nearly
20 years after the landmark Osgood, Suci, and Tannenbaum (1957) publication,
raises intriguing questions about the acceptance, use, and reporting of
research techniques.

Data  Collection 

Methods of data collection can affect study outcomes. One particularly
relevant question is whether those who administered pretests and posttests
were blind to the purpose of the study and to the experimental group
membership of the Ss to whom they administered assessments. The convention
for coding whether blinded data collection occurred was based on the
assumption that researchers will usually report those aspects of their
methodology that are especially valued in treatises on research design.
Random sampling and assignment are such high priority procedures. Blinded
data collection is another. So, if there was no mention that test
administrators were blind to the purpose of the research or to the Ss' group
membership, it was presumed not to have occurred.

For 663 out of 705 effect sizes (94%), "No" was coded for blinded
collection. For 21 effect sizes (4%), blinded data collection was obvious,
with partial blinding (information kept from coders as to group membership or
whether a pre- or posttest was being administered) for 5 effect sizes (nearly
1%). Enough information was provided for 9 effect sizes (1%) to make the
rater unsure, and "Can't tell" was coded.

Test scoring is an essential part of data collection, and the blinded
scoring of tests is desirable in cases where test scorers must draw
inferences. As already noted above, paper-and-pencil, questionnaire-type
assessments were the predominant mode of dependent measure. With such
assessments, coding is routine. It is not surprising, therefore, that for
674 of 705 effect sizes (96%), the category on blinded test scoring was coded
as "Not Applicable". For 11 effect sizes (nearly 2%) for which blinded scoring
was pertinent, it was done; for 14 effect sizes (2%), it was not.

Reliability 

The reliability of the scores obtained on dependent measures is a
central concern in research, as low reliability has a negative impact on
validity as well as attenuating group differences. It is surprising,
therefore, that for 42 percent (N = 293) of the 705 effect sizes (see Table
16), no reliability coefficient was reported. (For 9 effect sizes, the
adequacy of reliability was asserted, but no coefficient was reported.) For
those effect sizes for which a reliability coefficient was reported, 187
coefficients (26% of the 705 effect sizes) fell in the range of .80 to 1; 200
coefficients (28%) in the range of .60 to .79; and 25 (3%) below .60.
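
The remark that low reliability attenuates group differences can be illustrated with the standard classical-test-theory relation, in which an observed standardized mean difference equals the true difference multiplied by the square root of the score reliability. This relation is a textbook result, not something taken from the report, and the function names and the "true" effect size of .50 below are hypothetical.

```python
import math


def attenuated_d(true_d: float, reliability: float) -> float:
    """Observed standardized difference given score reliability (0 < r <= 1)."""
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in the interval (0, 1]")
    return true_d * math.sqrt(reliability)


def disattenuated_d(observed_d: float, reliability: float) -> float:
    """Estimate of the difference that would be seen with perfectly reliable scores."""
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in the interval (0, 1]")
    return observed_d / math.sqrt(reliability)


# Hypothetical true effect of .50 at the boundaries of the report's
# reliability ranges (.60, .80, and perfect reliability).
for r in (0.60, 0.80, 1.00):
    print(f"reliability {r:.2f}: observed d = {attenuated_d(0.50, r):.2f}")
```

Under these assumptions, a measure with reliability near .60 would shrink a true effect by roughly a quarter, which is one reason the missing coefficients matter for interpreting the mean Ds.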

Many of the researchers were apparently not mindful that reliability is
an attribute of scores, not of tests, and that coefficients can vary widely
by population and test administration circumstance: About 64% of the reported
reliability coefficients came from other studies. (For example, when the
ATDP scale was used, it was common for the authors to cite the reliability
figures given in Yuker et al. [1970], and not to report a coefficient for
their sample.) Another 9 percent of the reported coefficients came from
pilot studies. For 4 percent of the effect sizes for which a reliability
coefficient was reported, no source could be discerned. Only 24% of the
coefficients reported (for 14% of the effect sizes) were computed for the
samples of Ss studied.

Validity 

Did test score validity fare any better in our population of studies? A
crucial starting point in the consideration of the validity of measures for
assessing attitudes would seem to be definition of the construct, "attitude".
However, Towner (1984) lamented the lack of definitions of "attitude" in the
modification studies she reviewed. In our data set, no definition of
"attitude" was given in 114 of 215 studies (53%), accounting for 317 of 705
effect sizes (45%) (see Table 17). For those effect sizes for which a
definition was presented, a conception of attitudes as having affective,
Table 16

Magnitude of Reliability Coefficients
for Dependent Measures in Effect Size Comparisons

                      Effect Sizes
Magnitude              N        %
None reported         293       42
.00 - .60              25        3
.60 - .79             200       28
.80 - 1.00            187       26
Total                 705       99

Table 17

Definitions of "Attitude"
in Research Reports

                                       Effect Sizes        Reports
Type of Definition                      N       %         N       %
None                                   317      45       114      53
Affective                               65       9        22      10
Cognitive                                9       1         4       2
Behavioral                               5       1         3       1
Affective and Cognitive                 88      12        21      10
Affective and Behavioral                13       2         3       1
Affective, Cognitive, Behavioral       208      29        48      22
Total                                  705      99       215      99
cognitive, and behavioral components was most common (N = 208; 54% of 388
effect sizes). A definition that involved affective and cognitive components
was next in frequency (N = 88; 23% of 388 effect sizes), followed by an
exclusive emphasis on affect (N = 65; 17%). The other definitions
(Cognitive, Behavioral, Affective and Behavioral; N = 27) constituted 7% of
the effect sizes for which definitions could be coded.

Given the large proportion of effect sizes for which the object of the
experimental treatment, "attitude", was not defined, it is not surprising
that for 374 out of 705 effect sizes (53%), the validity of scores on the
dependent measure was not discussed.* For only 32 effect sizes (4%) was
there extensive discussion of test validity. Moreover, for 94% (N = 660) of
the effect sizes the dependent measure was coded as "high" in reactivity
(with "low" [1%] and "moderate" [5%] the other choices). The adequacy of
validity was rated as "moderate" for 573 effect sizes (81%), "low" for 122
(17%), and "high" for only 10 effect sizes (1%).

Time  of  Posttest 

Changes in attitudes toward persons with disabilities must be sustained,
not temporary, to be of consequence. Towner (1984) called for testing to
determine the "longterm effects" of treatments to modify attitudes (p. 254).
Our data set confirms the need for that call. For 476 effect sizes (67%), an
immediate posttest was the source of data. For 90 effect sizes (13%), the
posttest was not immediate, but was delayed as much as a week to obscure the
connection with the treatment. Only 89 effect sizes (13%) were based on
follow-up posttesting, i.e., testing that followed an initial posttest. (For


*With  attitudes  clearly  a  psychological  construct,  it  was  perplexing  that 
several  authors  (e.g.,  Lapp,  1974;  Ozyurek,  1977)  referred  to  the  "content 
7% of the effect sizes, the rater could not determine when the posttest was
administered.)

Use of Theory

Given the scant attention paid to attitudes as a construct and to the
validity of the attitude assessments used, it would have been surprising to
find careful attention given to the theoretical bases for the attitude
modification techniques investigated in the various studies. Towner (1984)
noted that a majority of the reports of attitude modification she reviewed
did not indicate a theoretical base for the approach taken to attitude
change. In our data set, only 194 effect sizes out of 705 (27%) came from
comparisons in which an attitude change theory was the explicit basis for the
experimental treatment. The most common basis was prior research (N = 403;
57%), with the case "well developed" for 308 effect sizes (76% of 403), with
"few citations of prior studies" for 91 effect sizes (23% of 403), and with
prior research "mentioned but not cited" for 4 effect sizes (1%).

As Table 18 indicates, the predominant theory either used explicitly as
a base for a treatment (194 effect sizes; see paragraph above) or implicit in
the intervention (as judged by the rater with no direct evidence in the
report; 458 effect sizes of 705, or 65%) was the consistency-equilibrium
theory associated with theorists such as Festinger, Heider, Lecky, Levin,
McGuire, and Newcomb (see the conventions for Category C.2.a. in Appendix C
for theory definitions). The data in Table 18 must be interpreted with
caution, however, in light of the large number of effect sizes for which the
theoretical bases for the modification technique had to be inferred. The
most apt generalization is probably that the research on modifying attitudes
toward disabled persons is largely atheoretical.
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Table  18 

Attitude Change Theories
Underlying Experimental Treatments

                             Effect Sizes
Theory                         N       %
SR, Behavioral                29       4
Conditioning                  24       3
Consistency/equilibrium      518      73
Social Judgment               14       2
Functional                    59       8
Combination                   61       9
Total                        705      99

It was, in particular, puzzling to find no attitude change procedures
based on Rokeach's (1973) version of balance (consistency-equilibrium)
theory. His approach to attitude change is to induce self-dissatisfaction
with values, as a means to value change, attitude change, and behavioral
change. The approach has been applied successfully in areas as diverse as
civil rights (Rokeach, 1971, 1973; Rokeach & McLellan, 1972), teaching
behavior (Greenstein, 1976), and women's rights and the environment
(Ball-Rokeach, Rokeach, & Grube, 1984). It merits attention by those
interested in effecting not only modifications in attitudes, but changes in
behavior toward persons with disabilities.

Study  Populations  and  Samples 

There is some evidence that educational researchers do not often address
specifically in their reports the nature of their target or accessible
population, nor draw random samples from their accessible populations, make
random assignments to treatments, or replicate their results to establish
their stability and generalizability (Shaver & Norton, 1980a, b). Is that
statement applicable to the body of research on modifying attitudes toward
persons with disabilities?

Table 19 indicates that those doing research in this area have addressed
population issues even less often than those who have published in two social
studies journals and in ten years of the American Educational Research
Journal (AERJ). For a majority of the effect sizes in the reports coded for
this review, there was no mention of the groups to which the authors hoped
their results would be generalizable (target population, 73%) or from which
their samples came (accessible population, 61%). In fact, few authors even
used that terminology. For 7 effect sizes (1%), the term "target population"
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Table  19 


Treatment of Target and
Accessible Populations
for 705 Effect Sizes

                          Target Population              Accessible Population
                     Effect Sizes   % Reports^a       Effect Sizes   % Reports^a
                                   Social                           Social
Category               N      %    Studies  AERJ        N      %    Studies  AERJ
Not mentioned         512     73     45      67        432     61     17      49
Term used               0      0      0       1          3    0.4      0       1
Defined               186     26     55      32        193     27     72      41
Described               0      0      0       0          0      0     11       8
Term used and           7      1                        77     11
  population defined
Total                 705    100    100     100        705   99.4    100     100

^a Percentages from Table 2 in Shaver and Norton (1980a), based on 53 research
reports in all issues of two social studies journals through 1978 and 151
reports in the American Educational Research Journal (AERJ) for ten years,
1968-77.
was used and the population was defined. For 3 effect sizes (.4%) the term
"accessible population" was used, and for 77 other effect sizes (11%), the
term was used and the population defined in at least rudimentary terms.

By the same token, as Table 20 shows, random sampling of individual Ss
was rare. It was the basis for sample selection for 31 effect sizes
(4%)*. The random selection of groups provided the Ss for 31 effect sizes
(4%). The use of intact groups was the most common means of obtaining a
sample (N = 327 effect sizes; 46%). The use of volunteers was common (N =
237 effect sizes; 34%), and greater than for Shaver and Norton's (1980a)
sample of AERJ reports (9%) and social studies reports (24%).

Table 21 presents information on assignment to groups, with the 60
Treatment A vs. B effect sizes not included. Random assignment of the
individuals or groups used as the unit of analysis was done for 35% of the
effect sizes (N = 227), including 21 (3%) instances of matching followed by
random assignment. This is almost identical to the 35% of reports of random
assignment in Shaver and Norton's 10-year AERJ sample and considerably above
the 9% for the reports in their social studies research sample (Shaver &
Norton, 1980b)**.

Replications 

Related to the task of defining the populations from which samples are
drawn and to which one wants to generalize is the matter of replication, as


*This compares to 15% and 19%, respectively, for the samples of reports from
two social studies journals and AERJ reported by Shaver and Norton (1980a).
The Shaver and Norton data are not reported fully in Table 20 because
different categories were used.

**Information from Shaver and Norton (1980a) was not included in Table 21
because different categories were used.
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Table 

20 

Sample Selection for
705 Effect Sizes

                         Effect Sizes
Category                   N       %
Can't tell                43       6
Random (individuals)      31       4
Random (groups)           31       4
Volunteer                237      34
Intact groups            327      46
Other                     36       5
Total                    705      99
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Table  21 


Assignment to Treatment Groups
for 644 Effect Sizes

                               Effect Sizes
Category                         N       %
Can't tell                      23       4
Random                         206      32
Match-random                    21       3
Select controls randomly         3     0.5
  or matched
Intact groups, randomly^a      130      20
Convenience                    154      24
Other                           23       4
Not applicable^b                84      13
Total                          644   100.5

^a Intact groups assigned randomly, but not used as unit of analysis. If
assigned randomly and used as unit of analysis, coded as "random."

^b Single-group studies.
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it  is  often  argued  to  be  the  basic  scientific  means  of  establishing  the 
reliability and generalizability of results (e.g., Shaver, 1979a). As
inspection of Table 22 reveals, replications have not been a common feature
of studies in modifying attitudes toward disabled persons. About 1.5 percent
of the effect sizes came from efforts to replicate other studies. About 12
percent of the effect sizes came from within-study replications; however,
almost one-fourth of those 87 effect sizes were "quasi-replications," that
is, effect sizes based on data gathered from different samples or in
different settings in the study and coded separately even though the
researchers did not recognize them as replications.

Replicability 

It is noteworthy as well that for 290 of the 705 effect sizes (41%), the
description of the treatment variable was not deemed adequate to allow
another researcher to replicate the study (Category C.8.e.). For 111 effect
sizes (16%), description was deemed adequate for replication; and for 304
(43%), description was judged to be "somewhat" adequate.

A treatment must first be implemented to be replicated later. However,
for only 37 effect sizes (5%) was the actual implementation of treatment
rated as "complete" (Category C.8.d.). For 630 effect sizes (89%),
implementation was judged to be "mostly" complete, and for 38 effect sizes
(5%), the treatment was rated as implemented "only in part." At the same
time, for 639 effect sizes of 705 (91%), no report was made of an effort to
verify that the treatment had been implemented as intended.

Qualification  of  Results 
In their critiques of research reporting, Shaver and Norton (1980a, b)
gathered data on the extent to which the authors of research reports
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Table  22 

Replications Among 705 Effect Sizes

                          Effect Sizes
Type of Replication         N       %
Other Research
  None                     696      99
  Direct                     3     0.4
  Systematic                 6       1
  Total                    705   100.4
Within Study
  None                     618      88
  Direct                     7       1
  Systematic                80^a    11
  Total                    705     100

^a Includes 21 "quasi-replications," that is, studies in which the treatment
was repeated on different samples or in different settings and the results
were coded separately, even though not treated as a replication by the
researchers.
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restricted  their  conclusions  in  terms  of  the  shortcomings  in  their  accessible 
populations  (as  related  to  the  target  population)  or  samples.  They  found 
conclusions tempered by accessible population deficiencies in 15 percent of
their social studies research reports and in 7 percent of their AERJ
articles; 23 percent of the social studies reports took sampling inadequacies
into account in drawing conclusions, while 11 percent of the AERJ articles
did. We broadened the question for this review, asking if conclusions were
qualified  by  reference  to  sampling  or  design  problems,  possible  interactions 
of  personological  or  ecological  variables  with  the  experimental  treatment, 
the  assessments  used,   the  need  for  replication,  or  "other"  considerations. 

Table 23 presents the results. As can be noted, for 66% of the effect
sizes, the authors provided some limitation on their conclusions about the
effectiveness of the technique for attitude modification. The 14 percent of
qualifications based on the sampling process is close to those of the Shaver
and Norton study (see preceding paragraph). Interestingly, the largest
percentage of qualifications (for 260 effect sizes, 37%) took into account
combinations of factors. That is encouraging, although the 34% with no
qualifications is an offsetting concern.

Bases for Effect Sizes
A relevant methodological matter for those interested in meta-analysis
as an approach to integrative literature reviews is the type of information
that is available for computing the effect sizes (in our review, Ds) that
are the center of attention. As noted in the discussion of our coding
instrument in Chapter 3, we gathered information on the source of each effect
size, and on the scale of mean difference and the standard deviation used in
computing each one. Tables 24-26 present data on the bases for the 644
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Table  23 


Limitations on Conclusions
About Treatment Effects
for 705 Effect Sizes

                         Effect Sizes
Limitation                 N       %
None                      242      34
Sampling                  100      14
Design                     66       9
Measures                   21       3
Interactions                5       1
Need for replication        2     0.3
Other                       9       1
Combination               260      37
Total                     705    99.3
effect sizes for our treatment versus control, treatment versus placebo, and
single-group, pre-posttest comparisons.

The major source of effect sizes was the calculation of Ds from data
available in the research reports (see Table 24). Approximately 90% (N =
577) of our Ds were obtained in that way. The other 10% were estimated using
statistics such as t-ratios and F-ratios. (When a D was estimated from a
correlated t or an F from an analysis of covariance, but the pre-post or
covariate-dependent measure coefficient was not available, a coefficient
of .50 was used. See Appendix B for effect size computation conventions.)
Although a subcategory was included on the instrument ("Available" in Table
24) for coding that a standardized mean difference was available in the
report, no use was made of it. It is interesting as well that in every use
of statistics to estimate Ds, except for the 9 Ds computed from analysis of
covariance F-ratios using .50 as the estimated coefficient, the mean D was
higher than the mean for Ds computed directly from the data. Fortunately,
with 90% of the Ds computed directly, that bias had little effect on the
analysis of data.
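
The estimation of a D from a t-ratio can be sketched as follows. This is a
minimal illustration that assumes the standard textbook conversions for
independent and correlated t-ratios; the conventions actually applied in this
review are those of Appendix B, which may differ in detail. The function
names and the sample values are hypothetical. The default coefficient of .50
mirrors the fallback value described in the text.

```python
import math


def d_from_independent_t(t, n1, n2):
    """Estimate a standardized mean difference from an independent-groups t.

    Assumes t = (M1 - M2) / (s_pooled * sqrt(1/n1 + 1/n2)).
    """
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


def d_from_correlated_t(t, n, r=0.50):
    """Estimate a standardized mean difference from a correlated (paired) t.

    Assumes equal pre and post standard deviations, so that the standard
    deviation of the gains is s * sqrt(2 * (1 - r)). The coefficient r is
    the pre-post correlation; .50 is used when it is not reported.
    """
    return t * math.sqrt(2.0 * (1.0 - r) / n)


# Hypothetical inputs, not taken from any coded report.
print(round(d_from_independent_t(2.0, 25, 25), 3))
print(round(d_from_correlated_t(2.0, 25), 3))
```

Under these assumptions, a higher pre-post coefficient yields a smaller
estimated D for the same correlated t, so the choice of .50 as a fallback
affects the size of the estimate.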

Another methodological feature of our data set having to do with the
basis of our effect sizes is the information available for scaling mean
differences. It was decided in developing the coding instrument that the
preferred scale would be raw mean gain scores, because they are a scale of
the same units as final status scores (Glass et al., 1981, p. 116) and they
avoid the problems of covariance-adjusted scores when the assumption of
homogeneous regression lines is not met. If raw gain means were not
available, raw posttest means were the next choice, followed by
covariance-adjusted means, and then by residual gain means.
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Table  24 

Sources of 644 Effect Sizes (Ds)

                          Effect Sizes (Ds)
Source                     N     Mean     SD
Availability              --      --      --
Calculated                577     .36     .60
t or ANOVA F               18     .46     .84
Correlated t                6     .40     .24
Correlated t (.50)         15     .58     .52
N-way ANOVA                 3     .42     .46
COVAR (.50)                 9     .15     .20
Proportions                12     .74    1.00
Significance level          2    1.23     .00
Other                       2     .90     .01
Total                     644     .37     .61

Note. Eta² = .02.
As can be seen in Table 25, 387 of the Ds (60%) in our data set were
based on raw mean gains. Raw posttest means were the basis for 163 Ds (25%)
and covariance-adjusted means for 28 Ds (4%). No residual gain means were
used. (The Not Applicable category [N = 66] encompasses the effect sizes not
calculated, but estimated from available statistics.) The proportion of
variance in Ds associated with the type of mean difference scale (Eta²) is
only .01. The Not Applicable mean D is the one discrepant value; and, again,
with only 10% of the Ds falling in that category, the effect of the bias is
minor.
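
The Eta² reported for Table 25 can be approximated from the tabled group
sizes, means, and standard deviations alone. The sketch below assumes that
the tabled standard deviations are sample values (n - 1 in the denominator)
and works from the rounded figures in the table, so it can only reproduce the
reported value to within rounding. The function name is hypothetical.

```python
def eta_squared(groups):
    """Proportion of variance between groups, from (n, mean, sd) triples."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-groups sum of squares, recovered from each group's SD.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    return ss_between / (ss_between + ss_within)


# (N, mean D, SD) for the four scale categories in Table 25.
table_25 = [
    (66, 0.51, 0.69),   # Not Applicable
    (163, 0.29, 0.68),  # Raw Post
    (387, 0.39, 0.57),  # Raw Gain
    (28, 0.36, 0.30),   # COVAR-Adjusted
]

print(round(eta_squared(table_25), 2))
```

The same function can be applied to the rows of the other effect size tables
in this chapter, with the same caveat about rounded inputs.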

For calculating standardized mean differences, as noted in Chapter 3, a
standard deviation based on as much data as possible that was free of
treatment effects was preferred. Pooled pre and posttest standard deviations
(see Table 26) were obtained for computing 248 Ds (38%), pretest standard
deviations were used for 108 Ds (17%), and posttest control or placebo group
standard deviations were used for 140 Ds (22%). Estimates of standard
deviations from various sources (within-group variances, or gain or
covariance adjusted scores) and pooled posttest variances were used in
computing 86 (13%) of the Ds. Although the mean Ds based on these estimated
standard deviations tend to be higher than those from sources free of
treatment effects (but similar to the Not Applicable mean D), the Eta² is
only .02. The small proportion of variance in the Ds that was associated
with the source of standard deviations is a reflection of the small numbers
of standard deviations that come from within-group variances.

Frequency  of  Can't  Tell  Scoring 
One  other  methodological  aspect  of  the  data  set  that  may  have  general 
interest  is  the  frequency  with  which  Can't  Tell  was  coded  for  the  various 
threats  to  treatment  validity  and  internal  validity  (see  discussion  in 
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Table 

25 

Scale of Mean Differences
for 644 Effect Sizes (Ds)

                        Effect Sizes (Ds)
Scale                    N     Mean     SD
Not Applicable           66     .51     .69
Raw Post                163     .29     .68
Raw Gain                387     .39     .57
COVAR-Adjusted           28     .36     .30
Total                   644     .37     .61

Note. Eta² = .01.


Table 26

Standard Deviations for 644 Effect Sizes (Ds)

                           Effect Sizes (Ds)
Source                      N     Mean     SD
Not Applicable              62     .53     .70
Post Control/placebo       140     .24     .67
Pretest                    108     .34     .76
Post & Pre pooled          248     .37     .47
1-way ANOVA                 15     .53     .75
N-way ANOVA                 21     .45     .44
1-way COVAR                 12     .47     .58
N-way COVAR                 28     .61     .47
Adjusted COVAR               1     .51     .00
Adjusted gain                4     .50     .46
Pooled post                  5     .43     .54
Total                      644     .37     .61

Note. Eta² = .02.
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Chapter  3).  Tables  27  and  28  contain  that  information  for  the  644  effect 
sizes  in  our  main  analyses  (see  Chapter  5).  As  can  be  seen,  the  availability 
of  information  necessary  to  decide  if  a  threat  was  present  varied  greatly 
among the threats. It is also clear that forced judgments about the presence
of threats would have been largely speculative for a large number of effect
sizes. It was our judgment in coding, supported by the data in Tables 27 and
28,  that  essential  information  is  frequently  missing  from  reports  for 
determining  whether  threats  to  treatment  and  internal  validity  existed  in  the 
research.  We  recommend  that  others  doing  meta-analyses  use  categories  to 
pick  up  the  extent  of  missing  information,  as  our  Can't  Tell  scoring  did. 

Summary 

Although this chapter contains discussions of some coding and analysis
decisions, the primary purpose was to sketch some dimensions of the
available body of research on modifying attitudes toward persons with
disabilities.* From all of the data available, some were selected to provide
a flavor of the volume of research, where it has been reported, the types of
attitude modification techniques investigated, with what groups of people,
and to change attitudes toward what disabilities. Some methodological
attributes were described as well: types of treatment comparisons, the
adequacy of assessment and the use of theory, sample selection and
assignment, the use of replications and the implementation and verification
of treatments, the extent to which conclusions took into account study
limitations, the bases for computing Ds, and the extent of "Can't Tell"
*As noted at the end of Chapter 3, the entire data set is available on
magnetic tape and can be obtained at cost by anyone who wishes to explore
further the characteristics of the reported research in this field.
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Table  27 

Threats to Treatment Validity for 644 Effect Sizes


Category (columns, left to right): Can't Tell, Not Plausible, Minor,
Substantial, Major, Not Applicable; each reports N, Mean, SD

Threat

Implementation 

.47 

.63 

.  Jo 

.61 

30 

.18 

.58 

Hawthorne 

160 

.35 

.58 

93 

.29 

.62 

234 

.42 

.65 

157 

.39 

.57 

I 

I 

John Henry

254 

262 

.  Jb 

.64 

30 

.22 

.32 

13 

.  11 

.39 

85 

.42  .66 

Treatment  Diffusion 

239 

.36 

.71 

142 

.29 

.50 

131 

.46 

.55 

45 

.35 

.41 

5 

1.06 

.87 

82 

.38  .62 

Dissatisfaction/ 
Resentment 

246 

.31 

.53 

303 

.39 

.64 

44 

.53 

.71 

5 

-.25 

.53 

5 

.27 

.25 

41 

.52  .74 

Novelty/Disruption

183 

.41 

.58 

332 

.36 

.59 

52 

.42 

.92 

7 

.03 

.11 

1 

.41 

.00 

19 

.25  .35 

Experimenter Effect/
Expectations 

218 

.37 

.56 

23 

.53 

.53 

130 

.35 

.70 

241 

.37 

.59 

31 

.36 

.76 

1 

.75  .00 

Treatment-experimenter 
Confounded 

82 

.36 

.51 

37 

.34 

.61 

98 

.36 

.75 

221 

.52 

.54 

205 

.23 

.50 

1 

.75  .00 

Testing  by  treatment 
interaction 

24 

.54 

.76 

10 

.61 

.81 

151 

.37 

.57 

426 

.37 

.62 

33 

.27 

.38 

Multiple treatment
interference 

573 

.37 

.61 

62 

.50 

.65 

1 

.52 

.00 

8 

-.10 

.28 

Note.    Means  and  standard  deviations  are  for  Ds. 
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Table  28 

Threats to Internal Validity for 644 Effect Sizes


Category (columns, left to right): Can't Tell, Not Plausible, Minor,
Substantial, Major; each reports N, Mean, SD

Threat

Maturation 

326 

.43 

.63 

272 

.29 

.58 

9 

.45 

.45 

37 

.46 

.65 

History 

456 

.32 

.55 

66 

.47 

.69 

56 

.38 

.66 

62 

.64 

.79 

4 

.60 

.36 

Testing 

74 

.32 

.50 

433 

.34 

.60 

38 

.53 

.56 

60 

.43 

.56 

39 

.62 

.92 

Instrumentation 

35 

.58 

.93 

29 

.29 

.43 

566 

.36 

.59 

12 

.33 

.42 

2 

1.66 

.23 

Statistical  Regression 

25 

.68 

.84 

592 

.35

.59 

18 

.55 

.60 

9 

1.02 

.65 

Selection 

52 

.38 

.58 

197

.41 

.70 

65 

.36 

.46 

236 

.38 

.61 

94 

.28 

.53 

Experimental  Mortality 

153 

.30 

.46 

303 

.44 

.66 

97 

.38 

.70 

66 

.28 

.56 

25 

.25 

.41 

Note.    Means  and  standard  deviations  are  for  Ds. 
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coding  for  treatment  and  internal  validity.  The  intent  was  to  set  a  context 
for  the  reporting  of  our  analyses  to  determine  the  effects  of  attitude 
modification  techniques  as  they  can  be  discerned  from  the  available 
literature.    That  is  the  subject  of  the  next  chapter. 
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CHAPTER  5 

ATTITUDE  MODIFICATION  TECHNIQUES: 
THE  RESULTS 

The sketch of our data set in the prior chapter does not address the
central objective of this review of literature: to determine whether
different techniques for modifying attitudes toward persons with disabilities
have yielded different effect sizes. Before turning to analyses that address
that item more specifically, some prefatory comments are appropriate in
regard to differences between data analysis for an integrative review as
compared to the typical data analysis approach for primary research studies.

Perspective  on  Analysis 
Once the data are collected for a quantitative, integrative review of
the research literature, the next steps, as in a primary research study, are
to analyze the data and report the results. Data analysis is, however, not
so straightforward for a quantitative review as for most primary experimental
research. In conducting primary experimental research, there are usually few
variables to take into account: a treatment variable with two or three
levels, one or perhaps two dependent measures, and maybe a covariate, or
classification variable, or two. The analysis can be pretty well specified
beforehand, with the only open question likely to be whether the correlation
of a covariate with a dependent measure is sufficiently high to justify an
analysis of covariance. Beyond that, the researcher conducts an analysis
(using a technique such as ANOVA or COVAR), checks for statistical
significance, does any contrasts of pairs of means that are appropriate, and
reports the results in a table or two accompanied by a brief discussion of
whether statistical significance (usually the conventional .05 level) was
attained. Occasionally now, the researcher also reports an effect size
(standardized mean difference or correlation coefficient) to back up the
inferential statistics results.

All  of  the  above  is  accomplished  via  a  relatively  noncircuitous  route; 
once  the  preplanned  analysis  is  set  in  gear,  few  decisions  are  necessary 
about  the  paths  to  follow.  The  decisions  to  be  made,  rather  automatic  ones, 
are  such  as  whether  to  include  a  covariate,  whether  a  finding  attained 
statistical  significance,  whether  post  hoc  contrasts  should  be  made,  and,  on 
occasion,  whether  the  results  from  analyses  with  inferential  statistics 
square  with  those  from  effect  size  computations. 

This admittedly somewhat overdrawn picture of the simplicity of analysis
in primary research is in marked contrast to the analysis in an integrative
review such as reported here. To begin with, there is a greater number of
variables to take into account. The number of what might be termed
"treatment levels" may not appear large at first glance. For example, we
coded six attitude change techniques, plus combinations (Category C.4.). But
rather than a defined treatment (e.g., "information" or "contact") applied in
a particular setting, as in a typical research project, the review analyst
has a number of applications of the treatment with differing characteristics
across applications. As a result, the number of possible treatment variations
is large. The number becomes larger when variations in sample attributes and
perhaps even ecological variables are taken into account. Of course, the
complexity is further compounded when different dependent measures are used
across primary research studies, and even further by the consideration of
research quality variables. With some 140 categories for coding the research
reports, as in this review, the number of possible combinations is
astronomical.
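
A rough lower bound gives a sense of scale. The sketch below assumes, purely
for illustration, that each of the 140 coding categories has only two
possible codes; most categories in the instrument have more, so the true
count would be larger still. The function name is hypothetical.

```python
def min_combinations(n_categories, levels_per_category=2):
    """Count the distinct coding profiles if every category is crossed."""
    return levels_per_category ** n_categories


lower_bound = min_combinations(140)
# Scientific notation keeps the 43-digit integer readable.
print(f"{lower_bound:.2e}")
```

Set against 705 coded effect sizes, even this lower bound shows why only a
small fraction of the possible combinations of study characteristics can be
examined.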
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One  solution  would  be  to  use  a  much  smaller  number  of  categories  to  code 
the primary research reports, but that would be akin to the dysfunctional
primary research strategy of not attending to interactions between treatment
and personological and ecological variables because they make interpretation
difficult. Instead, we were guided in setting up the coding instrument (as
discussed in Chapter 3) by the admonition to include "all characteristics of
the primary studies that are strongly suspected of affecting the
findings . . ." (Jackson, 1978, p. 57). The upshot is a complex analysis
process with difficult decisions about what to report and how. One major
issue is how to handle data on the methodological quality of the studies in a
data set.

Quality of Research
The methodological quality of the studies from which effect sizes are
collected has been a source of concern since Glass (1976) first proposed the
use of the meta-analytic approach to integrative reviews. Glass's proposal
to include studies regardless of quality and analyze for the effects of
quality (discussed in Chapter 3) drew a "garbage in-garbage out" criticism
from Eysenck (1978) that is often cited in discussions of meta-analysis.
Although the concept of analyzing for quality effects is still controversial
(Bangert-Drowns, 1986), our stance in planning the procedures for this review
was the same as Glass's: That is, include all studies, code for quality, and
determine if effect sizes covary with study quality.

Quality  Indicators 

Although a number of our coding categories are related to quality of
study, three global categories are particularly appropriate indicators of
methodological soundness: general treatment validity, general internal
validity, and adequacy of test validity. Each is widely regarded by
researchers to be central to the validity of experimental results, and each
is based on information from other categories.

Internal validity is, for example, in part a function of how Ss are
assigned to treatments. In fact, some researchers regard studies with random
assignment to be, ipso facto, of high internal validity. However, as we
noted in Chapter 3, that is not the case. Random assignment may be a sine
qua non of excellent internal validity (as indicated by the fact that random
assignment was used for all but two of the effect sizes coded in our review
as coming from studies with high internal validity), but it does not
guarantee validity. Even selection as a threat to internal validity is
controlled by random assignment only within chance limits. That is, groups
with quite different antecedent characteristics can, and do, occur even with
random assignment, as statistical theory indicates should be expected. So,
rather than consider the absence or presence of random assignment per se as
an indicator of study quality, we coded method of assignment and took that
into account in coding the various threats to internal validity (see Table 28
in Chapter 4) that were weighed in coming to a judgment about the internal
validity of each study.*
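
The point that random assignment controls selection only within chance limits
can be illustrated with a small simulation. This is a sketch under stated
assumptions, not part of the review's procedures: antecedent scores are drawn
from a standard normal distribution, Ss are split at random into two equal
groups, and a pre-existing difference of more than half a standard deviation
is counted as "quite different." The function name, group sizes, threshold,
and seed are all hypothetical choices.

```python
import random
import statistics


def share_of_unbalanced_assignments(n_per_group=10, threshold=0.5,
                                    trials=2000, seed=0):
    """Share of random assignments whose groups differ before treatment."""
    rng = random.Random(seed)
    unbalanced = 0
    for _ in range(trials):
        # Antecedent scores with a population SD of 1.
        scores = [rng.gauss(0.0, 1.0) for _ in range(2 * n_per_group)]
        rng.shuffle(scores)  # random assignment to two groups
        group_a = scores[:n_per_group]
        group_b = scores[n_per_group:]
        gap = abs(statistics.mean(group_a) - statistics.mean(group_b))
        if gap > threshold:
            unbalanced += 1
    return unbalanced / trials


print(share_of_unbalanced_assignments(n_per_group=10))
print(share_of_unbalanced_assignments(n_per_group=100))
```

With small groups a substantial share of random assignments produces groups
that differ noticeably before any treatment; the share shrinks as group size
grows, which is why method of assignment was weighed together with the other
coded information rather than treated as decisive by itself.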


Summary statistics for the three global indicators of quality are
presented in Table 29. Two attributes of the data are striking: First, few
studies received high or excellent ratings on any of the three types of

*The Cramer's V for internal validity and type of assignment to groups was
.44, a moderately high relationship, as would be expected.
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Table  29 
Quality of Study Indicators

General Treatment Validity
                  Effect Sizes (Ds)
Quality            N     Mean     SD
Excellent           4     .25     .30
Fair              245     .45     .68
Poor              395     .33     .56
Total             644     .37     .61
Note. Eta² = .01

General Internal Validity
                  Effect Sizes (Ds)
Level              N     Mean     SD
High               15     .89     .87
Medium            211             .58
Low               418     .38     .61
Total             644     .37     .61
Note. Eta² = .02

Adequacy of Test Validity
                  Effect Sizes (Ds)
Adequacy           N     Mean     SD
High                9    1.13     .69
Moderate          520     .36     .62
Low               115     .40     .55
Total             644     .37     .61
Note. Eta² = .02


global  validity.  Second,  none  of  the  ratir-^s  of  validity  explain  much  of  the 
variability  in  effect  sizes  (as  indicated  by  the  Eta^s  of  .01,  .02,  and  .03). 
The  low  correlation  between  quality  ratings  and  Ds  is  at  least  in  part  a 
function  of  the  lack  of  variability  in  the  former:  -ew  effect  sizes  came 
from  studies  with  excellent  or  high  ratings. 
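The Eta² values reported for Table 29 can be approximated from the published summary statistics alone. The sketch below is illustrative only: it assumes that Eta² is the ratio of the between-groups sum of squares to the total sum of squares and that each reported SD used an n - 1 denominator. The function name `eta_squared` is hypothetical, and the rounded table entries are used in place of the raw effect sizes.

```python
def eta_squared(groups):
    """Return SS_between / SS_total for groups given as (n, mean, sd) tuples."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    # Between-groups sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-groups sum of squares, assuming each SD used an n - 1 denominator.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    return ss_between / (ss_between + ss_within)


# General Treatment Validity rows of Table 29: (N, mean D, SD).
treatment_validity = [
    (4, 0.25, 0.30),    # Excellent
    (245, 0.45, 0.68),  # Fair
    (395, 0.33, 0.56),  # Poor
]

eta2_treatment = eta_squared(treatment_validity)
print(round(eta2_treatment, 2))  # 0.01
```

With the rounded entries for General Treatment Validity, the result agrees with the Eta² of .01 noted for that indicator; small departures are to be expected for other panels because the inputs are rounded to two decimals.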

To  determine  the  association  between  Ds  and  membership  in  the  higher 
frequency  medium  and  low  quality  categories,  point  biserial  coefficients  were 
computed.  The  squared  coefficients  are  .004,  .01,  and  .001  for  treatment, 
internal,  and  test  validity,  respectively,  again  indicating  that  very  little 
variance in Ds was associated with quality ratings.
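For readers unfamiliar with the statistic, a point biserial coefficient is the Pearson correlation between a two-category variable and a continuous one. The sketch below uses invented numbers, not values from our data set, and the function name `point_biserial` is hypothetical; it is meant only to show the arithmetic behind a squared coefficient of the kind reported above.

```python
from math import sqrt


def point_biserial(values, flags):
    """Correlation between a 0/1 flag and a continuous variable."""
    n = len(values)
    ones = [v for v, f in zip(values, flags) if f == 1]
    zeros = [v for v, f in zip(values, flags) if f == 0]
    p = len(ones) / n   # proportion of cases in the category coded 1
    q = 1 - p           # proportion of cases in the category coded 0
    mean_all = sum(values) / n
    # Population SD (n denominator), which pairs with the sqrt(p * q) term.
    sd_all = sqrt(sum((v - mean_all) ** 2 for v in values) / n)
    mean_diff = sum(ones) / len(ones) - sum(zeros) / len(zeros)
    return mean_diff / sd_all * sqrt(p * q)


# Hypothetical effect sizes: four from "medium" quality studies (flag 1)
# and four from "low" quality studies (flag 0).
ds = [0.5, 0.3, 0.7, 0.4, 0.2, 0.4, 0.1, 0.5]
is_medium = [1, 1, 1, 1, 0, 0, 0, 0]

r_pb = point_biserial(ds, is_medium)
print(round(r_pb, 2), round(r_pb ** 2, 2))
```

The squared coefficient is read as the proportion of variance in Ds associated with category membership, which is how the values of .004, .01, and .001 are interpreted in the text.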

Why  Few  High  Ratings? 

The  lack  of  studies  rated  to  be  high  on  the  quality  indicators  could  be 
a  function  of  the  actual  quality  of  research  in  the  field  or  of  invalid 
ratings.  In  regard  to  the  latter  possibility,  it  is  worth  noting  that  the 
raters  were  able  to  apply  the  research  quality  conventions  reliably;  readers 
will have to peruse our discussion of our coding instrument in Chapter 3 and
the  conventions  (Appendix  C)  to  determine  if  we  were  over-rigorous  in  our 
definitions  of  excellent  or  high  validity.     We  think  not. 

After  reading  the  reports  of  the  273  studies  that  made  up  our 
population,  it  is  our  conclusion  that  the  quality  is  actually  not  very  high 
in  general,  due  to  two  factors.  The  first  is  that  attitude  research  is 
difficult  to  conduct,  especially  in  applied  settings  (e.g.,  in  elementary 
schools)  rather  than  laboratories.  There  are  major  impediments  to  gaining 
adequate  control  over  subject  and  ecological  variables  in  applied  settings. 
Moreover,  major  issues  remain  to  be  addressed  before  adequate  instruments  for 
assessing attitudes toward persons with disabilities will be readily
available (see, e.g., Makas, 1985). Even careful researchers are sorely
tested to produce studies that have no flaws in internal validity when faced
with the challenges of applied psychological research.

Beyond the difficulty in conducting attitude change research, however,
another reason for the lack of high quality ratings accorded to the studies
in our sample is simply poor design and execution (as well as inadequate
reporting, if better methodology was used than we were able to discern).
Some examples from our data illustrate the point: For some 65% of the effect
sizes, randomization of Ss was not reported; in 24%, conveniently available
groups were used for the treatment and control conditions. For only about 4%
of the effect sizes were data collectors either fully or partially blinded.
For 41% of the effect sizes, there was no mention of a reliability
coefficient for the dependent measure scores; and, for the 47% of the effect
sizes for which reliability coefficients were reported, only 24% had been
computed for the study sample (64% came from other reports of research). For
93% of the effect sizes, the reactivity of the attitude assessment was judged
to be high. And, for 83% of the effect sizes, the research reports contained
no mention of any effort to verify implementation of the treatment
independent variable. It would be difficult to argue that the available body
of research on modifying attitudes toward persons with disabilities is
exemplary in methodology.

A certain irony is hinted at in the data reported in Table 29.
Frequently, poor internal validity is construed as contributing to large group
differences that are inadvertently attributed to treatment effects. Although
it is recognized that low test score reliability attenuates mean differences
(as well as correlation coefficients), obscuring true treatment-outcome
relationships, it is often not recognized that other threats to experimental
validity can have the same effect. To the extent that studies are poorly
designed and executed, treatments may be less powerful or differences
otherwise obscured by threats to validity. While the number of effects from
studies with "excellent" treatment validity was too small (N = 4) in our data
set to be interpretable, the results from the somewhat larger numbers of
effect sizes for high internal and test validity (15 and 9) indicate that
higher mean Ds may come from high quality studies. Even in those two cases,
however, the numbers are so small that it is difficult to place confidence in
the stability of the findings. Nevertheless, they raise interesting
implications for validity-outcome relationships.

A "Best-Evidence" Approach

In any event, the results with treatment, internal, and test validity
pose a quandary. On the one hand, there appears to be little association in
our data set between the magnitude of Ds and the quality of the studies from
which they come, at least as assessed via these global indicators. The
logical conclusion is, therefore, that quality of study need not be a
consideration in analyzing the data; that is, it did not produce much of the
variability in results for different attitude modification techniques which
can be seen in Table 34. On the other hand, it can be argued (see, e.g.,
Bangert-Drowns, 1986, p. 392) that unless the studies being reviewed vary
widely in methodological rigor, it makes little sense to examine study
quality-outcome relationships. In particular, without a sufficient number of
high-quality studies, it can be contended, one lacks an adequate comparative
base for determining whether or how study quality affected results. For our
data set, we only have a hint that the Ds might have been higher if the
studies from which they came had been carried out with greater methodological
rigor.

If  this  review  were  being  conducted  from  a  non-Glassian  stance  that 
studies  with  methodological  flaws  should  be  excluded  from  the  analysis,  our 
data  set  would  shrink  appreciably.  Moreover,  given  the  lack  of  association 
between  effect  sizes  and  moderate  or  low  quality  categorization,  there  would 
be  little  reason  to  discard  the  low  quality  studies  and  analyze  only  effect 
sizes  from  the  medium  quality  ones. 

What might seem like a readily apparent solution to some (that is, not to
attempt any integrative review) is argued against by Slavin (1986) in his
proposal  for  "best  evidence"  research  syntheses.  Clearly,  if  high  quality 
studies  are  available,  they  should  be  relied  on  in  an  integrative  review. 
But  if  such  studies  do  not  exist,  it  is  appropriate  to  "cautiously  examine 
the  less  well  designed  studies  to  see  if  there  is  adequate  unbiased 
information  to  come  to  any  conclusion"  (p.  6). 

Slavin argues that the application of a priori criteria in selecting
"best  evidence"  studies  (rather  than  Glassian  exhaustive  inclusion,  followed 
by  quality-outcome  analyses)  is  at  the  heart  of  the  "best  evidence"  synthesis 
approach  (p.  6).  He  does  not,  however,  present  any  compelling  reason  to 
reject  as  "best  evidence"  studies  that  have  been  selected  because  they  are 
topic-relevant  but  turn  out  not  to  yield  an  adequate  basis  for  determining 
the association between study quality and results. The biggest impediment in
such  a  best-evidence  review  would  be  exercising  the  necessary  caution  without 
any  basis  for  deciding  which  information  is  "unbiased"  and,  therefore, 
legitimate  to  use  in  drawing  conclusions. 

We have proceeded with our analysis in a form of "best-evidence" review
which Slavin obviously did not intend to support. As Bangert-Drowns (1986)
has pointed out, such a decision depends in large part on the purpose of the
integrative review. An appropriate goal, rather than to construct theory or
estimate treatment parameters, is to characterize the available research as a
basis not only for insights into treatment effectiveness, but for decisions
about further research. With those objectives, careful summarization of past
research has a significant place, even if only to make evident that which
remains to be done.

Overall  Effects 

In  Chapter  4,  we  discussed  the  decision  to  focus  our  analysis  on  effect 
sizes  that  came  from  comparisons  of  attitude  modification  techniques  against 
the absence of treatment (i.e., a control or placebo group, or pre-treatment
scores). Unless specifically noted, the reports of effect sizes that follow
come from treatment versus control (T vs. C), treatment versus placebo (T vs.
P), or single-group, pre-posttest (Pre-post) comparisons.

Some  preliminary  information  is  pertinent  to  the  general  perception  of 
the  treatment  effect  sizes.  One  such  bit  of  information  is  the  research 
report authors' own views of the effectiveness of their attitude change
treatments.  Table  30  indicates  nearly  a  balance  between  the  number  of  effect 
sizes  for  which  the  authors  concluded  the  treatment  was  effective  (N  =  285; 
44%)  and  those  for  which  the  treatment  was  deemed  not  to  have  had  an  effect 
(N = 259; 40%). Considered with the 40 effect sizes (6%) for which the
results were considered equivocal, along with the 19 (3%) effect sizes for
which it was concluded that the effect was negative, the summary of
conclusions suggests that it should not be easily assumed that the use of
just any modification technique will lead to a positive effect. Note, too,
that the mean Ds parallel closely the conclusions drawn, with means of .74
for "produced effect", .51 for "equivocal", .03 for "no effect", and -.64 for
"negative effect". The Eta² of .37 indicates a moderate association between
the conclusions drawn and the magnitude of effect sizes.

Table 30
Research Report Authors' Conclusions re Treatment Effectiveness

                          Effect Sizes (Ds)
Conclusion            N      %     Mean     SD
None stated          42      6      .34    .41
No effect           253     40      .03    .32
Equivocal            40      6      .51    .49
Produced effect     284     44      .74    .61
Negative effect      20      3     -.63    .36
Total               644     99ᵃ     .37    .61

Note. Eta² = .37.
ᵃIn this and later tables, percentages may not always add up to 100 because of
rounding error.

Along the same lines, there was a small association (Eta² = .22) between
the availability and level of statistical significance and the Ds obtained
(see Table 31). The 173 effect sizes (27%) for which statistical
significance was not available yielded a mean D of .41. As expected, the
mean D for statistically significant effect sizes was much higher than that
for nonstatistically significant ones, .73 versus .06. The point biserial
coefficient for magnitude of Ds and membership in the "not significant" and
"significant at .05" groups is .47 (r²pb = .22).

Relevant to the number of conclusions about negative effects (Table 30)
is the number of effect sizes for which an attitude modification treatment
group showed a negative change. As Table 32 reveals, for 12 percent of the
effect sizes (N = 77) the treatment group's posttest mean was lower than its
pretest mean, producing a mean D of -.19. It should be noted that a
"negative change" by a treatment group is not the same as a "negative D",
except for single-group, pre-posttest comparisons. If the control or placebo
group had a greater negative change, the D would be positive; also, the
treatment group might have had a positive change that was less than the
positive change of the control or placebo group, yielding a negative D.
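The distinction between a negative change and a negative D can be illustrated with a small sketch. It assumes, purely for illustration, that D is the treatment group's change minus the comparison group's change, divided by a common standard deviation; the formulas actually used in this review are those described in Chapter 3. The function name `standardized_difference` and all of the numbers are hypothetical.

```python
def standardized_difference(treatment_change, comparison_change, sd):
    """Difference in pre-to-post change, expressed in standard deviation units."""
    return (treatment_change - comparison_change) / sd


# Case 1: both groups decline, but the comparison group declines more.
d_case1 = standardized_difference(treatment_change=-2.0, comparison_change=-5.0, sd=10.0)

# Case 2: both groups improve, but the treatment group improves less.
d_case2 = standardized_difference(treatment_change=1.0, comparison_change=4.0, sd=10.0)

# Case 3: single-group, pre-posttest comparison (no comparison group change).
d_case3 = standardized_difference(treatment_change=-2.0, comparison_change=0.0, sd=10.0)

print(d_case1 > 0, d_case2 < 0, d_case3 < 0)  # True True True
```

Only in the single-group case does the sign of D necessarily match the sign of the treatment group's own change.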

Comparisons  of  Experimental  Treatments 

What about the outcomes of the comparisons of experimental treatment
groups against control or placebo groups or pretest scores? The various
treatment techniques and combinations of techniques are briefly described in
Table 33. They are arranged in rank order in Table 34, according to the
magnitude of mean Ds. The mean effect sizes (Ds) for the attitude
modification techniques can be viewed from two perspectives: (1) What does
the average D for each treatment technique indicate about its effects as
compared to no treatment? (2) What is indicated about the relative
effectiveness of the different techniques?

Table 31
Numbers of Effect Sizes for Which
Statistical Significance Information Was Available

                               Effect Sizes (Ds)
Statistical Significance      N     Mean     SD
Not Available               173      .41    .46
Not Significant             258      .06    .34
Significant at .05          213      .73    .75
Total                       644      .37    .61

Note. Eta² = .22.

Table 32
Changes by Attitude Modification Treatment Groups

Direction of Experimental         Effect Sizes (Ds)
Group Change                  N      %     Mean     SD
Positive                    567     88      .45    .60
Negative                     77     12     -.19    .35
Total                       644    100      .37    .61

Table 33
Brief Descriptions of Attitude Modification Techniques as Coded

Technique                       Description

Information                     Information on disabilities (e.g., etiology,
                                characteristics, problems, similarities with
                                nondisabled, prostheses) provided by means
                                such as speakers, films, and books

Direct Contact                  Ss in situation where they observe or
                                interact with persons with disabilities

Vicarious Experience            Ss put in situations to help them experience
                                what it is like to have disabilities

Persuasive Message              An argument presented via persons or printed
                                or electronic media to convince Ss that they
                                should have positive attitudes toward persons
                                with disabilities

Persuasive Message, Contrast    Different messages or media used with
                                treatment groups to investigate relative
                                effectiveness

Systematic Desensitization      Thinking about disabled persons in relaxed,
                                nonthreatening settings to extinguish
                                negative attitudes

Positive Reinforcement          Use of classical or operant conditioning to
                                modify behavior assumed to reflect attitudes

Other                           Any combination of techniques other than
                                Information Plus Direct Contact or
                                Information Plus Vicarious Experience, which
                                were coded separately

Table 34
Effect Sizes for Attitude Modification Techniques

                                    Effect Sizes (Ds)      Differences Between Meansᶜ
Rank  Technique                        N    Mean     SD      2    3    4    5    6    7    8    9
1     Persuasive Message              23     .67    .56    .16  .24  .27  .28  .35  .38  .47  .54
2     Information Plus Contact       100     .51    .66         .08  .11  .12  .19  .22  .31  .38
3     Direct Contact                  93     .43    .73              .03  .04  .11  .14  .23  .30
4     Vicarious Experience            58     .40    .76                   .01  .08  .11  .20  .27
5     Other                           71     .39    .64                        .07  .10  .19  .26
6     Systematic Desensitization      21     .32    .44                             .03  .12  .19
7     Information                    203     .29    .51                                  .09  .16
8     Information Plus Vicarious      62     .20    .36                                       .07
9     Persuasive Message, Contrast    11     .13ᵃ   .33
      Positive Reinforcementᵇ          2  (1.74)  (.01)
      Total                          644     .37    .61

ᵃTen of 11 Ds came from one study.
ᵇToo few effect sizes (less than 10) to be interpretable, and so not ranked.
ᶜNumbers correspond to those for ranks of techniques. For example, the
difference between the Persuasive Message mean (1) and the Information Plus
Contact mean (2) is .15 (.67 - .52).

Although we have cautioned against the use of conventions to judge the
magnitude of effect sizes when the standards are arbitrary because there is
no basis by which to judge the importance of variations in outcomes, it is
difficult to discuss results with no criteria in mind. Lacking more firmly
grounded conventions, Cohen's (1977) criteria for small (d = .2), medium (d
= .5), and large (d = .8) effect sizes provide a useful frame, if applied
with caution.
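As a rough aid to reading the mean Ds against Cohen's benchmarks, the following sketch attaches a label to each ranked technique. Treating the benchmarks as cutoffs (at or above .2, .5, and .8) is an assumption made here for illustration; Cohen offered them as reference points rather than bins. The function name `cohen_label` is hypothetical, and the means are those listed in Table 34.

```python
def cohen_label(d):
    """Label an effect size using Cohen's (1977) benchmarks as cutoffs."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"


# Mean Ds for the ranked techniques in Table 34.
mean_ds = {
    "Persuasive Message": 0.67,
    "Information Plus Contact": 0.51,
    "Direct Contact": 0.43,
    "Vicarious Experience": 0.40,
    "Other": 0.39,
    "Systematic Desensitization": 0.32,
    "Information": 0.29,
    "Information Plus Vicarious": 0.20,
    "Persuasive Message, Contrast": 0.13,
}

labels = {name: cohen_label(d) for name, d in mean_ds.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```

Under these cutoffs no technique reaches "large"; the labels should be read with the same caution as the criteria themselves.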

From that perspective, it is worth noting that none of the mean Ds reach
the .8 criterion, although the mean D for the Persuasive Message studies
is .67, closer to the large effect size criterion (.8) than to the medium one
(.5). The differences between the Persuasive Messages mean D and the mean Ds
for the other attitude modification techniques are all above the arbitrary
standard for trivial differences (.12) which we set in Chapter 3. Moreover,
in three cases the difference is greater than the standard for a medium
difference (.31), approaching the standard for a large difference (.50) in
one instance.

That messages developed purposely with an argument to sway attitudes
would have the largest effect size on the average makes sense. It also may
be of significance that 78% of the 23 Persuasive Message effect sizes come
from studies in which the theory base (S-R/behavioral for 11,
congruity/equilibrium for 5, and social judgment for 6) was explicit and the
relationship to the treatment well-developed. (For "explicit theory base",
the closest percentage was Systematic Desensitization with 75%, dropping then
to Information Plus Vicarious Experience with 31%; for "explicit relationship
to treatment", the same relationship held except that "Other" was third
highest, with 34%.)

The Information Plus Contact studies produced the next largest mean
D, .51, falling right at the arbitrary criterion for a medium effect size.
Note again that the Information Plus Contact mean D is .16 below that for
Persuasive Messages, barely larger than the arbitrary standard for trivial
differences which we set in Chapter 3. At the same time, the differences
between Information Plus Contact, on the one hand, and Contact and Vicarious
Experience, on the other, .08 and .11, are both less than the .12 trivial
difference standard; but the Information Plus Contact mean D equals or
exceeds the .12 criterion for all other comparisons, equaling or exceeding
the criterion for a moderate difference (.31) in two instances.

The next three mean Ds are clustered closely together (.43 for
Contact, .40 for Vicarious Experiences, and .39 for Other, that is,
combinations of techniques other than the two in Table 34), with Ds that fall
at the midpoint of Cohen's criteria for small and medium effect sizes (.2 and
.5). The differences between mean Ds that are lower in the ranking are
non-trivial only for the Information Plus Vicarious Experience and Persuasive
Message, Contrast techniques, which yielded small Ds (.20 and .13). The two
remaining Ds, for Systematic Desensitization (.32) and Information (.29), are
somewhat larger than the .20 small effect size standard, and only slightly
higher than the means below them.

To sum up, although the mean Ds for the various techniques range
from .67 to .13, clearly a broad range, there are no clear demarcations or
groupings of techniques. In only one case (Persuasive Message versus
Information Plus Contact) is the difference between contiguous means greater
than our index of triviality (.12). The use of Persuasive Messages seems
clearly to have resulted in larger Ds on the average than any other
technique. Contact Plus Information runs a close second, and its use seems
clearly to have produced larger Ds on the average than the use of Systematic
Desensitization and the techniques ranked below it. At the other end of the
scale, the effect sizes that came from efforts to investigate different
persuasive messages or media for delivering them indicate general
ineffectiveness, not even yielding a "small" (.20) mean D.
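The statement about contiguous means can be checked directly from the ranked mean Ds. The sketch below is illustrative: the variable names are hypothetical, the means are the rounded values from Table 34, and differences are rounded to two decimals before being compared with the .12 triviality standard.

```python
# Ranked techniques and mean Ds, in the order given in Table 34.
ranked_means = [
    ("Persuasive Message", 0.67),
    ("Information Plus Contact", 0.51),
    ("Direct Contact", 0.43),
    ("Vicarious Experience", 0.40),
    ("Other", 0.39),
    ("Systematic Desensitization", 0.32),
    ("Information", 0.29),
    ("Information Plus Vicarious", 0.20),
    ("Persuasive Message, Contrast", 0.13),
]

TRIVIAL = 0.12  # standard for trivial differences set in Chapter 3

# Difference between each mean and the one ranked immediately below it.
contiguous = [
    (upper, lower, round(d_upper - d_lower, 2))
    for (upper, d_upper), (lower, d_lower) in zip(ranked_means, ranked_means[1:])
]

non_trivial = [(upper, lower) for upper, lower, diff in contiguous if diff > TRIVIAL]
print(non_trivial)  # [('Persuasive Message', 'Information Plus Contact')]
```

Only the first pair exceeds the standard, which is why no clear groupings of techniques emerge from the ranking.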

Treatment Variability: Homogeneity of Ds

While it might be tempting to look at the rankings in Table 34 as an
index of effectiveness to be used in a singular fashion in selecting a
technique to modify attitudes toward those with disabilities, that would
obviously be too simplistic an interpretation of a complex set of data. To
begin with, the standard deviations associated with each mean D serve as a
reminder that the effects of each technique are not homogeneous; obviously,
there is considerable overlap among the distributions of Ds for the various
techniques. Moreover, it is important to remember that included in the Ds
summarized by the means in Table 34 are negative values, indicating that,
relative to the comparison group, a treatment had a negative rather than
positive effect.

Table 35 presents a summary of the 150 negative effect sizes. Two
things are worth noting: First, the percentage of negative effect sizes for
each technique is roughly proportional to the percentage of effect sizes
contributed to the total 644. No one technique contributed a markedly
disproportionate number, or percentage, of negative Ds. That is also made
clear by the last column. But, second, it is remarkable that 23 percent (N =
150) of the 644 Ds were negative. That figure not only highlights the
cautionary note in regard to keeping variability in mind, but raises serious
questions about the adequacy of the bases for the attitude modification
treatments that were investigated. It also suggests that the treatments
grouped under each technique label were not necessarily perfectly alike, even
though quite different from those grouped under other labels.

Table 35
Negative Effect Sizes (Ds) for the Attitude Modification Techniques

                                     Negative Effect Sizes (Ds)        % of Negative
Technique                          Nᵃ          %ᵇ       Mean      SD   Technique Dsᶜ
Persuasive Message                1/23        1/4    (-.36)ᵈ  (.00)ᵈ       (4)ᵈ
Information Plus Contact         19/100      13/15    -.2[?]    .29         19
Direct Contact                   18/93       12/14    -.20      .17         19
Vicarious Experience             17/58       11/9     -.36      .42         29
Other                            18/71       12/11    -.38      .31         25
Systematic Desensitization        4/21        3/3    (-.27)ᵈ  (.29)ᵈ
Information                      53/203      35/31    -.30      .32         26
Information Plus Vicarious       16/62       11/10    -.24      .19         26
Persuasive Message, Contrast      4/11        3/2    (-.14)ᵈ  (.10)ᵈ      (36)ᵈ
Positive Reinforcement            0/2         0/.3
Total                           150/644     101/99.3  -.29      .30         23

ᵃFor N, the first figure is the number of negative effect sizes. The second
figure is the total number of effect sizes.
ᵇFor %, the first figure is the percentage of the 150 negative effect sizes; the
second figure is the percentage of the total 644 effect sizes.
ᶜ% of Negative Technique Ds is the percentage of the number of the Ds for a
technique that were negative. E.g., 19% of the Information Plus Contact Ds were
negative.
ᵈToo few effect sizes (less than 10) to be interpretable.
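The percentages in the last column of Table 35, and the overall figure of 23 percent, follow from the counts in the N column. The sketch below recomputes them; the names `negative_counts` and `percent_negative` are hypothetical, and rounding to the nearest whole percent is assumed.

```python
# (negative Ds, total Ds) per technique, as listed in the N column of Table 35.
negative_counts = {
    "Persuasive Message": (1, 23),
    "Information Plus Contact": (19, 100),
    "Direct Contact": (18, 93),
    "Vicarious Experience": (17, 58),
    "Other": (18, 71),
    "Systematic Desensitization": (4, 21),
    "Information": (53, 203),
    "Information Plus Vicarious": (16, 62),
    "Persuasive Message, Contrast": (4, 11),
    "Positive Reinforcement": (0, 2),
}


def percent_negative(negative, total):
    """Percentage of a technique's Ds that were negative, to the nearest percent."""
    return round(100 * negative / total)


by_technique = {
    name: percent_negative(neg, tot) for name, (neg, tot) in negative_counts.items()
}
total_negative = sum(neg for neg, _ in negative_counts.values())
total_ds = sum(tot for _, tot in negative_counts.values())
overall = percent_negative(total_negative, total_ds)

print(total_negative, total_ds, overall)  # 150 644 23
```

The per-technique values for the larger categories fall between 19 and 29 percent, consistent with the observation that no one technique contributed a markedly disproportionate share of the negative Ds.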

Variation  in  Treatment  Features 

Some  differences  in  treatment  features  within  techniques  are  worth 
consideration.  For  example,  in  Tables  36  and  37,  it  can  be  noted  that  there 
was  considerable  variability  in  both  the  types  of  information  and  the  modes 
of  presenting  it  in  the  studies  of  the  Information  approach  to  attitude 
modification. The large number of Combination ratings for both types of
information and delivery mode also suggests further variability, in the way
that individual components were put together.

Variability  within  treatment  categories  is  also  evident  for  the 
Vicarious  Experience  and  Persuasive  Message  studies  (see  Tables  38  and  39). 
It  is  interesting,  as  well,  that  while  variations  in  type  of  information  and 
mode of delivery accounted for about 6 to 7 percent of the variance in
Information  Ds,  the  percentage  of  variance  attributable  to  treatment 
variations  is  much  larger  for  Vicarious  Experience  and  Persuasive  Message 
Ds — 20%  and  28%,  respectively — suggesting  that  choice  of  technique  features 
could  be  more  important  there. 

Information gain. The studies of Information as an attitude change
technique evidenced another type of variability. As noted in Chapter 3,
there are two pertinent questions to be asked of studies in which purveying
information is the attitude modification technique: Did the Ss learn the
intended content? And, to what extent were gains in knowledge of that
content related to attitude outcomes? The answer to the first question can
be construed as speaking to the extent to which the treatment independent
variable was implemented, the second as getting at the relationship between
different values of the independent variable and scores on the dependent
attitude measure.

Table 36
Types of Information Presented in
Information Treatment Technique Studies

                                           Effect Sizes (Ds)
Information                              N      Mean       SD
Characteristics of disabled persons      2     (.52)ᵃ   (.21)ᵃ
Problems of being disabled               1    (-.08)ᵃ   (.00)ᵃ
Managing disabled children               1    (-.17)ᵃ   (.00)ᵃ
Similarities with nondisabled           11      .11      .60
How nondisabled react                   11      .39      .20
How to relate in social situations       2    (-.77)ᵃ   (.28)ᵃ
Other                                   15      .22      .51
Combination                            160      .31      .52
Total                                  203      .29      .51

Note. Eta² = .06.
ᵃToo few effect sizes (less than 10) to be interpretable.

Table 37
Information Delivery Modes Used in
the Information Treatment Technique Studies

                              Effect Sizes (Ds)
Delivery Mode               N      Mean      SD
Lecture                     1     (.55)    (.71)
Discussion                 [?]     [?]      [?]
Lecture-discussion          3      [?]      [?]
Print                     [?]3     [?]      .44
Panel-disabled              1     (.80)    (.00)
Speaker-disabled            4     (.22)    (.17)
Film, video                21      .40      .58
Picture, filmstrip          4    (-.02)    (.63)
Audio                       7     (.74)    (.59)
Simulations                 1    (-.08)    (.00)
Regular course             24      .32      .74
Regular program            23      .18      .44
Other                       7     (.27)    (.37)
Combination                67      .27      .46
Total                     203      .29      .51

Note. For mean Ds and standard deviations in parentheses, the number of
effect sizes is less than 10 and too few to interpret.
Eta² = .07.

Table 38
Types of Experience in Vicarious
Experience Treatment Technique Studies

                                    Effect Sizes (Ds)
Experience                        N      Mean      SD
Role play                         7     (.34)    (.43)
Simulation                       26      .60      .72
Observe role play
  or simulation                   2    (-.95)    (.46)
Video, films                      9     (.05)    (.22)
Print, fiction or biography       2     (.05)    (.07)
Other                             1    (-.09)    (.00)
Combination                      11      .59     1.06
Total                            58      .40      .76

Note. For mean Ds and standard deviations in parentheses, the number of
effect sizes is less than 10 and too few to interpret.
Eta² = .20.

Table 39
Types of Persuasive Messages Presentations in
Persuasive Message Treatment Technique Studies

                          Effect Sizes (Ds)ᵃ
Presentation            N      Mean      SD
Video, film             3     (.52)    (.32)
Audio                   3     (.31)    (.10)
Expert                  8     (.48)    (.40)
Expert, disabled        1    (1.32)    (.00)
Other                   8     (.99)    (.74)
Total                  23      .67      .56

Note. Eta² = .28.
ᵃAll mean Ds and standard deviations are in parentheses because the number of
effect sizes is less than 10 and too few to interpret.

Surprisingly few of the 203 Information effect sizes come from studies
in which Ss' gains in knowledge of content were even assessed and reported:
43 effect sizes out of 203 (21%), as is indicated in Table 40. Of those,
sufficient data were available for 37 effect sizes (13%) to compute
Information Gain Ds and to obtain the correlation between Information Gain
and Attitude Ds.

As  might  be  expected/  whether  or  not  the  authors  concluded  there  was  No 
Gain  or  a  Clear  Gain  in  information  was  highly  associated  with  Information 
Gain  Ds  (rpj^  =  .38).  Authors'  conclusions  in  regard  to  information  gain  were 
not/  however/  highly  associated  with  attitude  £s/  in  large  part  because  so 
many  (N  =  160)  came  from  studies  in  which  attention  was  not  paid  to 
determining  whether  Ss  actually  learned  the  information  which  was  to  have  an 
effect  on  their  attitudes.  For  those  instances  in  which  irj.ormation  gain  was 
assessed  so  that  Ds  could  be  computed/  the  relation  with  Ds  for  attitude 
dependent  measures  was  moderately  high  (r  =  .53;  r^  =  .28)*.    Given  the 


*Inspection  of  the  scatter-diagram  indicated  a  rectilinear  regression  line 
was  a  good  fit. 
Table 40

Information Gain and Attitudes in
Information Modification Technique Studies

Type of Report on Information Gain                      Total
None                                                      160
Narrative                                                   6
Descriptive Statistics                                      3
Statistical Significance                                    5
Descriptive Statistics and Statistical Significance        29
Total                                                     203

                                       Effect Sizes (Ds)
                                Information Gain          Attitude
Conclusion re
Information Gain                N      Mean    SD       N       Mean    SD
Gain Not Assessed               --     --      --       160     .27     .50
Clear Gain                      25     1.40    1.05     25      .29     .64
No Gain                         12     -.07    .44      12      .23     .44
Can't Tell, Inconclusive(a)     6      --      --       (6)(b)  .71     .16
Total                           43     .91     1.13     203     .29     .51

Note. Eta² for Information Gain Ds = .38; Eta² for Attitude Ds = .02.
The overall Pearson product-moment correlation coefficient for the 37
effect sizes for which both Information Gain and Attitude Ds were
available = .53; r² = .28.

(a) Narrative reports of information gain, so information Ds could not
be computed.

(b) Too few effect sizes (less than 10) to be interpretable.
stability often attributed to attitudes and the attendant difficulty in
modifying them, and the likely other ways in which the studies from which the
Ds came varied, that correlation is fairly sizeable.
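
The step from r = .53 to r² = .28 reported above is simply the squared correlation, read as the proportion of variance the two sets of Ds share. The short sketch below only illustrates that arithmetic; the function name is hypothetical and is not part of the original analysis.

```python
def shared_variance(r: float) -> float:
    """Proportion of variance shared by two variables, given Pearson r."""
    return r ** 2


# Correlation between Information Gain Ds and Attitude Ds (37 effect sizes).
r_info_attitude = 0.53
print(round(shared_variance(r_info_attitude), 2))
```

Read this way, roughly a quarter of the variability in attitude Ds is associated with variability in information gain Ds across these 37 comparisons.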

It  is  unfortunate  that  more  researchers  did  not  attend  to  verifying  that 
increased  knowledge  was  the  result  of  their  efforts,  or  argue  directly  that 
exposure  to  the  information,  not  learning  it,  was  deemed  to  be  the  important 
variable.  In  any  event,  not  only  the  assessment  of  information  gain  but  the 
extent  of  gain  were  other  types  of  variability  among  studies  in  the 
Information  attitude  modification  technique  category. 

Contact variability. There was also variability in the contact
situations used in Direct Contact studies (see Table 41) and in the
disabilities with which Ss were in contact. About 12 percent (Eta² = .12) of
the variance in Contact Ds was associated with situation differences, and
about 10% with differences in disabilities. However, the Ns upon which most
of the mean Ds in Tables 41 and 42 are based are so small as to make
interpretation untenable. The lack of interpretability is compounded by the
fact that, of the three Ds with sufficient Ns in Table 41, one is a
Combination category and another is an amorphous "Other" category.
Consequently, while Table 41 suggests diversity of Contact studies, it tells
us little about the effects of different types of contact. By the same
token, the category in Table 42 with the largest N is "Combination", and only
two other categories have more than 10 effect sizes in them.

Of particular interest, as noted in Chapters 2 and 3, are differences in
contact that might, according to theory, be related to attitude outcomes.
Along those lines, it is relevant that only 12 (13%) of the Direct Contact
effect sizes were coded as coming from studies in which there was an explicit
Table 41

Contact Situations for the
Direct Contact Treatment Technique Studies

                                 Effect Sizes (Ds)
Contact                          N      Mean      SD
As companion                     8      (.42)     (.42)
As peer tutor                    2      (.74)     (.29)
In cooperative learning          3      (.22)     (.31)
As classmates                    8      (1.13)    (1.81)
Practice teaching                4      (.23)     (.52)
In recreation program            4      (.52)     (.24)
Guest speaker                    18     .24       .29
As teacher or counselor          8      (.13)     (.58)
Other                            28     .50       .72
Combination                      10     .33       .19
Total                            93     .43       .73

Note. For mean Ds and standard deviations in
parentheses, the number of effect sizes is less
than 10 and too few to interpret.

Eta² = .12.
Table 42

Characteristics of Disabled Persons
in Contact Studies: Disabilities

                                 Effect Sizes (Ds)
Disability                       N      Mean      SD
Combination                      20     .35       .52
Mentally ill                     15     .56       .53
MR: Mild/Moderate                14     .63       1.50
MR: General                      6      (.51)     (.43)
MR: Can't Tell                   5      (.23)     (.37)
Severe Multiple                  5      (.36)     (.28)
Emotionally Disturbed            4      (.24)     (.27)
MR: Severe/Profound              3      (.50)     (.47)
Deaf                             3      (-.16)    (.35)
Multiple Disabilities            [values illegible in source]
Physical: General                2      (.51)     (.03)
Wheelchair                       2      (.42)     (.57)
Paraplegic                       2      (.17)     (.18)
Blind                            2      (.38)     (.35)
Hearing impaired                 1      (.75)     (.00)
Learning Disabled                1      (1.88)    (.00)
Can't Tell                       6      (.18)     (.23)
Total                            93     .43       .73

Note. For mean Ds and standard deviations in
parentheses, the number of effect sizes is less
than 10 and too few to interpret.

Eta² = .10.




theoretical basis for the treatment, and only 9 (10%) of the Ds came from
comparisons in which the theoretical base was rated as well-developed.

Data from the categories on the Contact Coding Sheet (see Appendix B),
constructed (as noted in Chapter 3) to reflect theoretically important
attitude change elements of contact, give some indication of diversity in
Contact treatment and associated outcomes. For example, one factor
considered in contact theory is the relative status of the interacting
disabled and nondisabled persons, with equal status deemed important to the
development of positive values. Table 43 contains data on three indicators
of status as well as for an overall status rating.

Clearly, there was considerable diversity among the studies in our
population in regard to the relative status of nondisabled Ss and the
disabled persons with whom they had contact. Moreover, some of the mean Ds
are perplexing. As expected, the mean D for "same age" contact (.61) is
higher than that for contact in which the persons with disabilities are
younger (.35). However, when the persons with disabilities were older than
the nondisabled Ss, the mean is lower (.28) than the "same age" mean. A
similar overall pattern holds for vocational-educational prestige, suggesting
that, as would be expected, that element of prestige and age are not
independent. In terms of "helping relationships" there is diversity, too.
Unfortunately, there were only 9 Ds for the professional help category; the
mean for that category does, however, reflect the observation in the
literature that professionals who have worked with disabled persons often
develop (or are assessed as having) more negative attitudes. The most
perplexing finding is for overall status. The comparisons in which the
persons with disabilities were rated as lower in status produced the highest
Table 43

Status Factors and Overall Status
in Contact Studies

Age
                                 Effect Sizes (Ds)
Relative Age                     N      Mean      SD
Can't Tell                       21     .47       .50
Disabled Younger                 23     .35       .70
Same                             24     .61       1.15
Disabled Older                   17     .28       .29
Variety                          8      (.36)(a)  (.31)(a)
Total                            93     .43       .73

Note. Eta² = .03.
(a) Too few effect sizes (less than 10) to be interpretable.

Vocational-Educational Prestige
                                 Effect Sizes (Ds)
Relative Vocational-
Educational Prestige             N      Mean      SD
Can't Tell                       21     .52       .59
Disabled Lower                   35     .34       .59
Same                             20     .64       1.21
Disabled Higher                  17     .28       .29
Total                            93     .43       .73

Note. Eta² = .03.

Helping Relationship
                                 Effect Sizes (Ds)
Helping Relationship             N      Mean      SD
None                             47     .38       .84
Professional                     9      (.03)(a)  (.42)(a)
Preprofessional                  18     .73       .79
Nonprofessional                  19     .46       .38
Total                            93     .43       .73

Note. Eta² = .06.
(a) Too few effect sizes (less than 10) to be interpretable.

Overall Status
                                 Effect Sizes (Ds)
Relative Status                  N      Mean      SD
Can't Tell                       2      (.03)(a)  (.16)(a)
Disabled Lower                   59     .54       .88
Equal                            16     .24       .30
Disabled Higher                  16     .26       .30
Total                            93     .43       .73

Note. Eta² = .04.
(a) Too few effect sizes (less than 10) to be interpretable.
mean D, .54, versus .24 and .26 for equal status and higher status for the
disabled, respectively.

Variability in Contact treatments is also evident in Table 44. Only 17%
of the Ss voluntarily initiated contact, while 53% (N = 49) were assigned to
a contact situation. The volunteers, as would be predicted based on theory,
did have a higher mean D (.48) than those assigned to contact (mean = .27).
Of course, most of the "role choice" Ss, for whom the mean D was highest
(.60), were volunteers in that they chose course work or employment knowing
that it would involve contact with persons who had disabilities.

There  was  less  variability  in  intimacy  of  contact,  with  70%  (N  =  65)  of 
the  effect  sizes  coming  from  studies  in  which  the  contact  was  rated  as 
casual.    Not  much  is  revealed,  therefore,  with  theoretical  implications. 

There was little variability in the nature of the cooperation involved
in Contact studies, with 84% of the effect sizes (N = 78) concentrated in two
categories, those in which cooperation was rated as either "not necessary"
(N = 21) or "implicit" (N = 57). Consistent with theory, the implicit
cooperation mean D (.49) is higher than that for the "not necessary" D (.36),
although the difference is just greater than the triviality criterion.
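
The comparison of the two cooperation means can be written out as a small check. The sketch below assumes the triviality criterion is the .12 difference in mean Ds cited later in this chapter, and assumes a difference counts as nontrivial only when it exceeds that value; the constant and function names are hypothetical.

```python
# Assumed threshold: the .12 criterion for a nontrivial mean difference
# mentioned in the discussion of time of posttest.
TRIVIALITY_CRITERION = 0.12


def is_nontrivial(mean_a: float, mean_b: float,
                  criterion: float = TRIVIALITY_CRITERION) -> bool:
    """True when two mean Ds differ by more than the criterion.

    The difference is rounded to two decimals, the precision of the
    reported means, to avoid floating-point noise at the boundary.
    """
    return round(abs(mean_a - mean_b), 2) > criterion


# Implicit cooperation (.49) versus cooperation "not necessary" (.36).
print(is_nontrivial(0.49, 0.36))
```

With the reported means the difference is .13, which clears the assumed criterion by .01, in line with the description of it as "just greater" than trivial.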

The results for three categories developed to describe types of contact
are not presented because of the lack of information in reports for coding
them. Little can be said in regard to the pleasantness of contact, another
theoretically important factor, because there was sufficient information for
coding that category for only 10% of the effect sizes. By the same token, it
could not be determined from the reports for 99% of the Contact effect sizes
whether the Ss received explicit reinforcement or shared internal or external
reinforcement with those who had disabilities. For 95% of the effect sizes,
Table 44

Type of Contact in Contact Studies

Basis for Contact Initiation
                                 Effect Sizes (Ds)
Basis                            N      Mean       SD
Assigned                         49     .27        .59
Role choice                      23     .60        .75
Voluntary                        16     .48        .42
Varied                           5      (1.06)(a)  (1.83)(a)
Total                            93     .43        .73

Note. Eta² = .08.
(a) Too few effect sizes (less than 10) to be interpretable.

Intimacy of Contact
                                 Effect Sizes (Ds)
Intimacy                         N      Mean       SD
No Interaction                   3      (1.24)(a)  (2.10)(a)
Casual                           65     .38        .55
Close                            4      (.39)(a)   (.49)(a)
Varied                           13     .31        .43
Potential Contact                8      (.75)(a)   (1.49)(a)
Total                            93     .43        .73

Note. Eta² = .06.
(a) Too few effect sizes (less than 10) to be interpretable.

Extent of Cooperation and Competition
                                 Effect Sizes (Ds)
Cooperation-Competition          N      Mean       SD
Can't tell                       7      (.17)(a)   (.32)(a)
No opportunity                   3      (1.24)(a)  (2.10)(a)
Not Necessary                    21     .36        .55
Implicit cooperation             57     .49        .73
Explicit cooperation             2      (.07)(a)   (.25)(a)
Implicit competition             1      (-.47)(a)  (.00)(a)
Combination                      2      (.17)(a)   (.18)(a)
Total                            93     .43        .73

Note. Eta² = .00.
(a) Too few effect sizes (less than 10) to be interpretable.
the same lack of information was encountered in regard to the modeling of
positive interactions.

Another theoretically important contact variable is the degree of support
for positive attitudes from institutional norms, persons in authority, and
peers. For 76% of the effect sizes (N = 71), there was not enough
information in the report to code "yes" or "no" (see Table 45).
Nevertheless, the higher mean D (.61 versus .38) when support could be
identified, with a difference of .24, is consistent with contact theory.
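
The Eta² values reported with these tables can be approximated from the published Ns, means, and standard deviations alone. The sketch below is one way to do that, assuming each SD is a sample standard deviation (n - 1 denominator) and that Eta² is the between-group sum of squares divided by the total sum of squares; the function name is hypothetical, and rounding in the published means limits the precision of the result.

```python
def eta_squared(groups):
    """Estimate Eta^2 from summary statistics.

    groups: list of (n, mean, sd) tuples, one per category.
    Assumes sd is a sample SD, so within-group SS = (n - 1) * sd**2.
    """
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    return ss_between / (ss_between + ss_within)


# Table 45: institutional support coded "Can't Tell" versus "Yes".
table_45 = [(71, 0.38, 0.73), (22, 0.61, 0.73)]
print(round(eta_squared(table_45), 2))
```

For the two Table 45 categories this comes to about .02, matching the value in the table note, so a mean difference of roughly .24 still accounts for very little of the variance in Contact Ds.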

The characteristics of the disabled persons with whom Ss have contact
are also important, according to the theory. In Table 42, we presented the
frequencies of types of disabilities. Table 46 contains the information that
could be garnered on two other attributes: the extent to which the persons
with disabilities acted in ways to reinforce negative stereotypes and the
extent to which they were likely viewed as competent by the Ss or to which
they openly acknowledged and accepted any lack of competence due to their
disability. There was variety among studies, in that for 18 effect sizes
(19%) it could be discerned that negative stereotypes likely were present,
but for 23 (25%) they were not. However, there was little difference in the
mean Ds (.33 versus .24). And, for 52 (56%) of the Ds, Can't Tell was coded.

Again, for competence there was some variety among the studies from
which the Contact effect sizes came. The result with the greatest potential
interest is for "Competent", with the high mean D of .75. However, with only
9 effect sizes, that result, and the .39 mean difference with "Lacked"
competence, must be treated with caution. The low mean D (.24) for the
"Acknowledged/accepted" category is also of interest because, theoretically,
acknowledging and accepting one's disability should have a positive effect on
the attitudes of nondisabled persons with whom one has contact.
Table 45

Extent of Institutional Support
for Attitude Change in Contact Studies

                                 Effect Sizes (Ds)
Support                          N      Mean      SD
Can't Tell                       71     .38       .73
Yes                              22     .61       .73
Total                            93     .43       .73

Note. Eta² = .02.
Table 46

Characteristics of Persons with Disabilities in
Contact Studies: Negative Stereotypes and Competence

Negative Stereotype Reinforced
                                 Effect Sizes (Ds)
Stereotype                       N      Mean      SD
Can't Tell                       52     .55       .93
Yes                              18     .33       .36
No                               23     .24       .29
Total                            93     .43       .73

Note. Eta² = .04.

Competence of Person with Disability
                                 Effect Sizes (Ds)
Competence                       N      Mean      SD
Can't Tell                       16     .68       1.19
Lacked                           50     .36       .54
Acknowledged/accepted            18     .25       .27
Competent                        9      (.75)(a)  (1.14)(a)
Total                            93     .43       .73

Note. Eta² = .05.
(a) Too few effect sizes (less than 10) to be interpretable.
Finally, the characteristics of the nondisabled Ss, in particular their
personality traits and their prior attitudes, are theorized to be of
importance to attitude change. We could find little attention to either in
the literature. For 83 Contact effect sizes (89%), personality traits were
not assessed; for none were the relationships between personality variables
and attitude change from Contact analyzed. Pre-treatment attitudes were
assessed, as one would expect given popular research designs for applied
settings, for 90 (86%) of the effect sizes; however, the relationship
of antecedent attitudes to the effects of contact on attitudes was analyzed
for only one effect size.

Contact  Summary 

Contact treatments varied widely, as did Information and Vicarious
Experience as attitude modification techniques. We have discussed the
differences in Contact in the context of theory as the effects of contact are
of particular interest in the field. Although approached from a somewhat
different perspective, our data largely support Makas's (1986) conclusion
that the inadequate design of studies has precluded the productive testing of
the hypotheses of contact theorists such as Allport (1954) and Amir (1976).
The inadequate reporting of studies is, as well, a barrier to post hoc
efforts to check results against theory. Further analyses of our data will
be conducted to attempt to discern the effects of combinations of factors
(such as contact which is both voluntary and with nonstereotypic disabled
persons), but the potential fruitfulness is limited by the lack of attention
to theoretically important variables, as evidenced by the failure to report
the information necessary to code them.
Attitudes toward . . . ? An important treatment feature is the
disability toward which the attitude modification efforts were directed. As
Table 47 indicates, 44 percent (N = 286) of the effect sizes came from
studies in which a target disability was not specified, but efforts were
directed at changing attitudes toward an amorphous category of "disabled
persons in general". The next most frequent change target, attitudes toward
general physical disabilities (or, put differently, unspecified physical
disabilities), was a distant second with 15 percent (N = 97) of the effect
sizes. From there, the number of effect sizes for disability targets drops
off rapidly to 65 (10%) for Mentally Ill, to 37 and 36 (6% each) for Mentally
Retarded, General (i.e., level of retardation not specified) and Combination
(i.e., more than one disability target specified). Each of the other
disability targets accounts for 4% or less of the 644 effect sizes.

Two types of treatment variability are evident in Table 47. First, the
effects of each attitude modification approach have been investigated with
several disability targets. Secondly, however, there is some clustering of
disability targets within treatments. For example, Contact effect sizes have
only come in substantial numbers (N of 10 or more) from studies directed at
changing attitudes toward disabled persons in general, the mentally ill, and
the mentally retarded in general*. Conversely, substantial numbers of effect
sizes for the mentally ill as an attitude change target came from studies
that investigated either Direct Contact or Information Plus Contact.

Moreover, not only the numbers but the effects are not consistent within
disabilities or treatments. That is, not only are there differences in total


*It does not help interpretation that none of the target disability effect
sizes for Persuasive Message as a technique, which had the highest overall
mean D (.67), is based on an N of 10 or more.
Table 47

Disabilities Toward Which Modification
Techniques Were Directed

Disability columns, in order: (1) Disabled, General; (2) Physical, General;
(3) Mentally Ill; (4) Mentally Retarded, General; (5) Combination;
(6) Hearing Impaired; (7) Moderately Retarded; (8) Severely Retarded;
(9) Visually Impaired; (10) Other; (11) Physically Impaired, Other;
(12) Emotionally Disturbed; (13) Learning Disabled; Total.

[Each cell is given as mean D, standard deviation, number of cases. Cells
are listed in source order, and "--" marks an empty cell where the source
marks one. Where a row shows fewer than 13 cells, the scan does not mark
the empty cells, so column positions after the leading cells are uncertain.]

Persuasive Message
  (.70) (.05) 3; (.49) (.34) 9; (.48) (.40) 8; (1.71) (.75) 3;
  --; --; --; --; --; --; --; --; --
  Total: .67 .56 23

Information Plus Contact
  .53 .47 31*; (.66) (.72) 9; .20 .52 22**; (.65) (.93) 8; (.52) (1.14) 8;
  --; (.91) (.96) 6; .50 .32 10; (.71) (.23) 2; (.56) (.93) 2; --;
  (1.20) (.25) 2; --
  Total: .51 .66 100

Direct Contact
  .41 .59 33; (.26) (.26) 4; .56 .53 15; .20 .30 11; (.74) (.29) 2;
  (.07) (.53) 4; (.91) (1.84) 9; (.50) (.47) 3; (.29) (.31) 7;
  (.83) (.00) 1; (.24) (.27) 4
  Total: .43 .73 93

Vicarious Experience
  .27 .84 29; (.61) (.50) 7; (.41) (.30) 4; (.30) 1 [one value illegible];
  (.79) (.51) 4; (1.47) (1.48) 3; (.52) (.27) 3; (-.01) (.17) 7
  Total: .40 .76 58

Other
  .41 .47 30; .64 .40 11; (.34) (.41) 4; (1.04) (.88) 6; (.37) (.66) 3;
  -.30 .49 13**; (.67) (.00) 1; (1.67) (.?3) 2; (.04) (.00) 1; --
  Total: .40 .64 71

Systematic Desensitization
  (.13) (.00) 1; (.30) (.55) 5; (-.10) (.49) 4; (.71) (.20) 6;
  (.25) (.13) 5
  Total: .32 .44 21

Information; Information Plus Vicarious Experience; Persuasive Message,
Contrast; Positive Reinforcement
  [The cells of these four rows are interleaved in the source, in this
  order:]
  .23 .51 104**; .15 .35 44**; .13 .33 11; .36 .51 43**; (.18) (.18) 7;
  (1.74) (.01) 2; (.17) (.41) 8*; --; (.19) (.28) 6; (.17) (.20) 2;
  .44 .71 13; ([illegible]) (.59) 4; --; (.22) (.17) 4; [illegible];
  (.12) (.41) 7; --; (.25) (.59) 8; (.55) (.21) 2; (.42) (.18) 2; --;
  (.10) (.33) 2; (.95) (.36) 2; (.87) (.03) 1; (.18) (.30) 2;
  (.59) (.47) 4
  Total, Information: .29 .51 203
  Total, Information Plus Vicarious Experience: .20 .36 62
  Total, Persuasive Message, Contrast: .13 .33 11
  Total, Positive Reinforcement: 1.74 .01 2

Total (by disability column)
  (1) .29 .53 286; (2) .46 .52 97; (3) .31 .50 65; (4) .56 .75 37;
  (5) .55 .71 36; (6) .19 .92 [?]4; (7) .76 1.32 20; (8) .36 .40 20;
  (9) .37 .42 20; (10) .56 .6[?] [N illegible]; (11) .09 .30 11;
  (12) (.63) (.49) 9; (13) (.46) (.44) [N illegible]
  Total: .37 .61 644

Note. The first number in each cell is the mean D, the second is the
standard deviation, and the third the number of cases. Means and standard
deviations in parentheses are based on fewer than 10 cases.

*At least 10 fewer cases than expected, based on marginal frequencies.
**At least 10 more cases than expected, based on marginal frequencies.
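
The footnotes to Table 47 flag cells holding at least 10 more or 10 fewer cases than expected from the marginal frequencies. A minimal sketch of that check follows, assuming the expected count is the usual product of row and column totals divided by the grand total; the function names are hypothetical, and the figures are the ones printed in the table.

```python
def expected_count(row_total: int, col_total: int, grand_total: int) -> float:
    """Expected cell frequency if technique and disability were unrelated."""
    return row_total * col_total / grand_total


def flag(observed: int, expected: float, margin: float = 10.0) -> str:
    """Return '**' for a surplus of at least `margin`, '*' for a deficit."""
    if observed - expected >= margin:
        return "**"
    if expected - observed >= margin:
        return "*"
    return ""


GRAND_TOTAL = 644

# Information Plus Contact (row N = 100) by Mentally Ill (column N = 65).
exp_mi = expected_count(100, 65, GRAND_TOTAL)
print(round(exp_mi, 1), flag(22, exp_mi))

# Information Plus Contact (row N = 100) by Disabled, General (column N = 286).
exp_dg = expected_count(100, 286, GRAND_TOTAL)
print(round(exp_dg, 1), flag(31, exp_dg))
```

Under these assumptions the 22 observed Mentally Ill cases exceed the expected count of about 10 by nearly 12, and the 31 Disabled, General cases fall about 13 short of the expected 44, which agrees with the markers printed on those two cells.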
mean Ds between disability targets (the Eta² for disability target and Ds
is .05), but between mean Ds within disability categories as well. For
example, there is a difference of .47 between the mean D for Disabled General
(.29) and Moderately Retarded (.76); yet, within Disabled General, the range
of mean Ds is from .15 (for Information Plus Vicarious Experience, ignoring
the .13 for Persuasive Message, Contrast because 10 of the Ds came from the
same study) to .53 (for Information Plus Contact), a difference of .38. By
the same token, there is considerable variance in mean Ds within treatment
categories. For Information Plus Contact the mean Ds range from .20
(Mentally Ill) to .66 (Physical General), a difference of .46. Although the
effectiveness of treatments might appear to be largely a function of
interactions with disability attitude targets, the disparities in Ns for
cells, as well as the large number of empty cells (not to mention the
potential underlying interactions with other factors such as age of Ss),
preclude such a conclusion, or even the use of analysis of variance to
determine the proportion of the variance in Ds attributable to the treatment
by disability interaction.

Implications. Variations in the features of treatment variables with
similar labels, such as described in the previous sections, are at the heart
of the "apples and oranges" criticism of meta-analyses, that is, the objection
that lumping studies for quantitative analysis obscures important differences
within groups of treatments (see, e.g., Bangert-Drowns, 1986, pp. 389-90,
392). That objection to meta-analysis raises a dilemma. One horn is the
effects of grouping studies. The other horn has to do with adequacy of
numbers of cases to sort out the apples and oranges. If the sets of effect
sizes for treatment groups are broken down to make fine, or even fairly
gross, within-treatment distinctions, the number in many cells is likely to
be so small as to yield findings in which little confidence can be placed.
The problem is exacerbated even further if one takes into account not only
treatment differences (those coded as well as the large number of uncoded
possibilities), but other study characteristics (to be discussed shortly for
our data set).

Rather than use the apples and oranges metaphor to refer to studies, or
effect sizes, within a treatment type, the metaphor might be more aptly
applied to refer to different types of treatment. That is, if keeping the
doctor away is the desired outcome, it is reasonable to ask whether different
fruits have different effects. However, it is especially important when
there is considerable variability in outcomes within fruit types to keep in
mind that all of the apples (or oranges, pears, bananas, etc.) were not the
same nor were the people who ate them or the conditions under which they were
eaten. By the same token, asking if there are differences in outcomes among
categories of attitude modification techniques is legitimate as long as one
does not lose sight of the variability in study characteristics and outcomes
within technique categories, and of the extent to which size of outcomes is
not neatly clustered within those categories.

Other Study Characteristics

As reported earlier in this chapter, the global indicators of
methodological quality are not related to outcomes in our data. Variations
in treatment techniques tend to be. Are there other study characteristics
that are? What other reservations might be necessary in drawing conclusions
about the attitude modification data presented in Table 34? One such issue
is whether effect sizes from treatment versus control (T vs. C), treatment
versus placebo (T vs. P), and single-group, pre-posttest (Pre-post)
comparisons should have been pooled for analysis.

Type  of  Comparison 

As can be seen in the first three columns of Table 48, the overall means
for treatment versus control (T vs. C) and treatment versus placebo (T vs. P)
comparisons (.36 and .29, respectively) were close to one another, but the
difference between each and the single-group, pre-posttest (Pre-post) mean D
(.49) was .13 and .20, respectively. Yet, the Eta² for the relationship
between comparison type and magnitude of D is only .01. The small Eta²
reflects in part the small numbers of T vs. P (N = 49; 7%) and Pre-post (N =
97; 15%) comparisons. With 77 percent (N = 498) of the Ds in the T vs. C
category, there was little variability in comparison type.

Another  way  to  approach  the  issue  is  to  ask  whether  excluding  the  Pre- 
post  Ds  from  the  mean  Ds  for  the  treatment  techniques  would  have  affected  the 
picture  portrayed.  As  can  be  seen  in  the  fourth  and  fifth  columns  in  Table 
48,  the  rankings  and  relative  magnitudes  of  the  mean  Ds  remain  essentially 
the  same  when  the  Pre-post  means  are  excluded  and  the  T  vs.  C  and  T  vs.  P 
means  are  pooled.  The  only  changes  in  ranking  are  for  means  which  are  nearly 
identical  in  both  the  T  vs.  C  plus  T  vs.  P  column  and  the  Total  column,  with 
mean  differences  so  small  (.01  to  .03)  that  differences  in  ranks  are 
basically  meaningless. 
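
The pooled columns of Table 48 can be reproduced, to rounding, as N-weighted averages of the component means. The sketch below assumes that is how the pooled means were formed (each D weighted equally, so each cell mean weighted by its N); the function name is hypothetical, and the figures are the Direct Contact row of the table.

```python
def pooled_mean(cells):
    """N-weighted mean of (n, mean) pairs."""
    total_n = sum(n for n, _ in cells)
    return sum(n * mean for n, mean in cells) / total_n


# Direct Contact row of Table 48: (N, mean D) for each comparison type.
direct_contact = {
    "T vs. C": (72, 0.35),
    "T vs. P": (3, 0.35),
    "Pre-post": (18, 0.76),
}

without_prepost = pooled_mean(
    [direct_contact["T vs. C"], direct_contact["T vs. P"]]
)
with_prepost = pooled_mean(list(direct_contact.values()))

print(round(without_prepost, 2), round(with_prepost, 2))
```

For Direct Contact the mean moves from .35 to .43 once the 18 Pre-post Ds are included, which shows how a modest number of single-group comparisons with a high mean can lift a technique's overall mean without changing its rank.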

It  is  worth  noting,  in  comparing  the  first  three  columns  of  Table  48  (T 
vs.  C,  T  vs.  P,  Pre-post),  that  Pre-post  comparisons  yielded  higher  mean  Ds 
for  Information  Plus  Contact  and  Direct  Contact,  types  of  techniques  likely 
to  be  used  in  college  courses  where  pre-posttest  data  are  often  gathered.  As 
a   matter  of   fact,    63%   (N  =  61)   of  the  Pre-post  Ds  came   from  course 
Table 48

Treatment Technique Effect Sizes (Ds)
by Type of Comparison

                                                 Effect Sizes (Ds)
                                                                      T vs. C plus
                                T vs. C       T vs. P       Pre-post      T vs. P       Total
Technique                       N    Mean     N    Mean     N    Mean     N    Mean     N    Mean
Persuasive Message              18   .53      3    (1.71)   2    (.41)    21   .70      23   .67
Information Plus Contact        79   .46      1    (.30)    20   .75      80   .46      100  .52
Direct Contact                  72   .35      3    (.35)    18   .76      75   .35      93   .43
Vicarious Experience            52   .32      2    (.47)    4    (1.41)   54   .33      58   .40
Other                           53   .45      10   -.27(a)  8    (.84)    63   .34      71   .39
Systematic Desensitization      21   .32      --   --       --   --       21   .32      21   .32
Information                     152  .33      24   .30      27   .03      176  .33      203  .29
Information Plus Vicarious      37   .14      6    (.36)    19   .28      43   .17      62   .20
Persuasive Message, Contrast    11   .13(b)   --   --       --   --       11   .13      11   .13
Positive Reinforcement          2    (1.74)   --   --       --   --       2    (1.74)   2    (1.74)
Total                           497  .36      49   .29      98   .49      546  .35      644  .37

Note. Mean Ds based on fewer than 10 effect sizes are considered too
unstable to interpret. They are in parentheses.

Eta² for Comparison Type (T vs. C, T vs. P, and Pre-post) and magnitude of D is .01.

(a) Nine of 10 Ds from the same study.
(b) Ten of 11 Ds from the same study.
evaluations (N = 39; 40%) and program evaluations (N = 22; 23%), and nearly
60 percent of the course and program evaluation effect sizes came from
samples of college and university students. So, even though pooling mean Ds
from the three types of comparisons did not have a significant impact on the
relative size or rankings of treatment technique means, it must be kept in
mind that single-group, pre-posttest comparisons contributed heavily to the
mean Ds for certain attitude modification techniques used with certain
samples.

Time  of  Posttest 

In Chapter 4, the small percentage (13%) of effect sizes that came from
follow-up posttesting in the total data set was noted. The question of
concern at this point is, was the magnitude of Ds associated with time of
posttest? The answer to that question bears on the long-term effects of the
attitude modification treatments. The answer also speaks to whether it was
appropriate to pool Ds from different times of posttesting for analysis, as
we have done.

In coding time of posttesting, two types of information were entered:
(1) the number of weeks between the end of treatment and administration of
the posttest; and (2) a code to identify whether the posttesting was
immediate, delayed, or follow-up. An "immediate" posttest was an initial
assessment administered very soon after the end of the experimental
treatment, i.e., within a day or at the next available meeting of the group.
A "delayed" posttest was also an initial posttest, but was not administered
immediately, sometimes to obscure the relationship between the treatment and
the test. A "follow-up" posttest followed an initial posttest, or it may
have followed a prior follow-up posttest.




As would be anticipated, the statistics for number of weeks from
treatment end to posttesting varied for the three time-of-posttesting
categories. For immediate posttests, the mean number of weeks to testing
was .03, with a standard deviation of .24 (and a mode and median of 0.00, and
a range of 0.00 to 4.3 weeks); for delayed posttests, the mean number of
weeks to testing was 1.7, with a standard deviation of 1.9 (and a mode and
median of 1, and a range of .20 to 8.00 weeks); for follow-up posttests, the
mean was 10.4, with a standard deviation of 18.4 (and a mode and median of 6,
with a range of 1.3 to over 100 weeks).

As can be noted from the mean Ds and the Eta² (.003) in Table 49, there
is little relationship between time of posttesting and average D. In
addition, when a correlation was run between the number of weeks after the
end of treatment when the posttest (immediate, delayed, or follow-up) was
administered and the Ds for the 586 effect sizes for which that information
was available, r = .05 (r² = .002), a very small relationship.

Was there a systematic time-of-posttest effect among treatment
techniques? Table 50 has information for the three treatments for which the
N for follow-up posttesting was sufficient (i.e., at least 10) for
interpretation. No pattern is evident, although there is one nontrivial (by
our .12 criterion) difference between a posttest and follow-up mean. For
Other, the follow-up posttest mean (.21) was .25 less than the posttest
mean D (.46).

To sum up, little evidence about "long term" effects is available from
our data. What information there is indicates little relationship between
time of posttesting and magnitude of D. And, there appeared to be little
reason not to pool the total 644 effect sizes for analysis.



Table 49

Effect Sizes (Ds) for Different
Posttesting Times

                                  Effect Sizes (Ds)

Time of Posttest                N       Mean      SD

Can't Tell                      50       .28      .54
Immediate                      427       .38      .63
Delayed                         86       .42      .60
(Immediate Plus Delayed)ᵃ     (513)     (.39)    (.62)
Follow-up                       81       .38      .57
Total                          644       .37      .61

Note.  Eta² = .003.

ᵃThe information from Immediate and Delayed (within
one week of end of treatment) posttesting is
presented pooled, as well as separately, because
posttests that close together would typically not
be discriminated in research reports, and the mean
Ds were nearly identical.
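
The Eta² in the note to Table 49 and the pooled Immediate Plus Delayed row can be approximated from the table's own summary statistics. The following is a minimal sketch, not the authors' procedure: it assumes Eta² is the usual ratio of between-group to total sum of squares and that each SD is a sample standard deviation (n - 1 denominator). The function names are illustrative only.

```python
import math

# (N, mean D, SD) for each time-of-posttest group, taken from Table 49.
GROUPS = {
    "Can't Tell": (50, 0.28, 0.54),
    "Immediate": (427, 0.38, 0.63),
    "Delayed": (86, 0.42, 0.60),
    "Follow-up": (81, 0.38, 0.57),
}


def sums_of_squares(groups):
    """Return (N, grand mean, SS between, SS within) from (N, mean, SD) triples.

    Assumes each SD was computed with an n - 1 denominator.
    """
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    return n_total, grand_mean, ss_between, ss_within


def eta_squared(groups):
    """Between-group sum of squares divided by total sum of squares."""
    _, _, ss_between, ss_within = sums_of_squares(groups)
    return ss_between / (ss_between + ss_within)


def pool(groups):
    """Combine groups into a single (N, mean, SD) triple."""
    n_total, grand_mean, ss_between, ss_within = sums_of_squares(groups)
    pooled_sd = math.sqrt((ss_between + ss_within) / (n_total - 1))
    return n_total, grand_mean, pooled_sd


eta2 = eta_squared(list(GROUPS.values()))
pooled_n, pooled_mean, pooled_sd = pool([GROUPS["Immediate"], GROUPS["Delayed"]])

print(f"Eta squared: {eta2:.3f}")
print(f"Immediate plus Delayed: N={pooled_n}, mean={pooled_mean:.2f}, SD={pooled_sd:.2f}")
```

Under these assumptions the sketch gives an Eta² that rounds to .003 and a pooled row of N = 513, mean .39, and SD .62, in line with Table 49. Because the inputs are already rounded to two decimals, small discrepancies from the published values would not be surprising for other tables.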


Table 50

Follow-up Posttest

                                    Effect Sizes (Ds)

                               Posttestᵇ         Follow-up
Techniqueᵃ                     N     Mean        N     Mean

Information                   167    .27         23    .37
Information Plus Contact       74    .54         13    .42
Other                          48    .46         14    .21

ᵃTechniques included only if the number of effect
sizes for either posttesting or follow-up
posttesting was at least 10.

ᵇImmediate and delayed posttest data combined.
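
The comparison of posttest and follow-up means against the .12 criterion can be sketched as below. This is only an illustration: treating a difference as nontrivial when it is strictly greater than .12 is an assumption inferred from the text (which counts one nontrivial difference in Table 50), and the names `TABLE_50` and `nontrivial_differences` are hypothetical.

```python
# Mean Ds (posttest, follow-up) from Table 50.
TABLE_50 = {
    "Information": (0.27, 0.37),
    "Information Plus Contact": (0.54, 0.42),
    "Other": (0.46, 0.21),
}

TRIVIAL_LIMIT = 0.12  # the report's criterion; the strict ">" test is an assumption


def nontrivial_differences(means, limit=TRIVIAL_LIMIT):
    """Return {technique: posttest minus follow-up} where |difference| > limit."""
    flagged = {}
    for technique, (posttest, follow_up) in means.items():
        # Round to two decimals so binary floating point does not push a
        # difference of exactly .12 slightly above the limit.
        diff = round(posttest - follow_up, 2)
        if abs(diff) > limit:
            flagged[technique] = diff
    return flagged


print(nontrivial_differences(TABLE_50))
```

With the Table 50 means, only the Other category is flagged, with a difference of .25.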
Type of Dependent Measure

Another aspect of testing that introduces study variability is the
instruments used to assess attitudes. As Bangert-Drowns (1986) has noted,
the "apples and oranges" criticism of meta-analysis has two components. The
first, discussed above, is the concern about grouping together under similar
labels studies with different features. The second involves apprehension
about the effects of lumping together for analysis effect sizes that come
from different dependent measures.

One way of considering similarities or differences in dependent measures
is to examine the assessment approaches used. Information in that regard is
presented in Table 51. It is clear that the assessment of attitudes was
dominated by questionnaires (N = 425; 66%), most of which were made up of
Likert-type items. The next most frequent type of instrument, the semantic
differential, is a distant second (N = 73; 11%). Only three other assessment
types yielded data for at least 10 effect sizes: social distance scales (N =
57; 9%); adjective checklists (N = 32; 5%); and a composite category of
tests that didn't fit in any of the major categories, "Other", with 35 effect
sizes (5%). The Eta² for Ds and type of assessment is .05. The association
is not large, with assessment clearly dominated by Likert-type scales.

A perplexing piece of information in Table 51 is the relatively low mean
D for social distance dependent measures (D = .16). With that low mean D, it
is relevant to inquire whether the few social distance scale assessments were
associated predominantly with any attitude modification technique.

One  source  of  evidence  relevant  to  that  question  is  the  overall 
association  between  treatment  technique  and  type  of  assessment.  The  Cramer's 
V  is  only  .18.    And,  that  value  reflects  primarily  a  greater  frequency  of 
Table 51

Types of Dependent Measures

                                   Effect Sizes (Ds)

Assessment Type                   N       Mean       SD

Questionnaires (Likert-type)     425       .38       .58
Semantic Differential             73       .35       .61
Social Distance                   57       .16       .41
Adjective Checklist               32       .48       .57
Sociometric                        9     (1.11)    (1.71)
Telephone-Mail Request             4      (.86)     (.44)
Projective Test, Pictures          2      (.03)     (.27)
Sentence Completion                2      (.08)     (.45)
Interview-Nonstructured            2     (1.66)     (.23)
Systematic Observation             1      (.44)     (.00)
Rankings                           1      (.23)     (.00)
Other                             36       .33       .62
Total                            644       .37       .61

Note.  Means and standard deviations based on fewer than 10 Ds are in
parentheses.  Eta² = .05.
questionnaire testing with the Information and Information Plus Contact
techniques than expected based on marginal totals (148 versus 134 expected
and 80 versus 67 expected, respectively) and fewer of that type of assessment
than expected with Direct Contact studies (51 versus 61 expected) and
Information Plus Vicarious Experience studies (19 versus 41 expected). There
was not a great disparity in the expected and actual uses of social distance
scales, based on marginal frequencies, nor, as follows, a great concentration
of use by treatment technique, as indicated in Table 52. Moreover, the
differences between the mean Ds based on sufficient numbers of effect sizes
to be interpretable are slight. The low mean D for Information Plus
Vicarious Experience (.07) is a trivial difference from the mean D for
Information (.14) and for Direct Contact (.11).
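
Expected frequencies "based on marginal totals" are conventionally computed as row total times column total divided by the grand total. The sketch below applies that convention to the social distance assessments. It is an illustration under stated assumptions, not a reproduction of the authors' computation: the technique totals are taken from the Total column of Table 58 (with the Information total taken as the sum of that row's cell counts), and the function name is hypothetical.

```python
# Number of effect sizes per technique (Total column of Table 58).
# The Information total is the sum of that row's cell counts (an assumption).
TECHNIQUE_TOTALS = {
    "Information": 203,
    "Direct Contact": 93,
    "Vicarious Experience": 58,
    "Persuasive Message": 23,
    "Persuasive Message, Contrast": 11,
    "Information Plus Contact": 100,
    "Information Plus Vicarious": 62,
    "Systematic Desensitization": 21,
    "Positive Reinforcement": 2,
    "Other": 71,
}
SOCIAL_DISTANCE_TOTAL = 57  # social distance effect sizes (Table 51)


def expected_counts(row_totals, column_total):
    """Expected cell count = row total * column total / grand total."""
    grand_total = sum(row_totals.values())
    return {
        name: row_total * column_total / grand_total
        for name, row_total in row_totals.items()
    }


expected = expected_counts(TECHNIQUE_TOTALS, SOCIAL_DISTANCE_TOTAL)
rounded = {name: round(value) for name, value in expected.items()}

for name, value in rounded.items():
    print(f"{name}: {value}")
print("Sum of rounded expected counts:", sum(rounded.values()))
```

Under these assumptions the rounded expected counts sum to 56 rather than 57, which agrees with the footnote to Table 52 that the expected total does not equal 57 because of rounding.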

There do not appear to be systematic differences in dependent measures
among studies of different attitude modification techniques that would have
major effects on outcomes. This is, of course, due in part to the lack of
variability in types of assessments, i.e., the prevalent use of
questionnaires to assess attitudes. Perplexing questions of construct
validity are raised by that use, and the perplexity is piqued by the low mean
D for social distance scale assessments.

The meaning of mean Ds based largely on reactive paper-and-pencil
assessments, with so few coming from indirect, behavioral methods of
assessment (Antonak, 1986), is a greater concern with this data set than is
the "apples and oranges" concern about intermingling effect sizes from
differing dependent measures. Rokeach (1970) commented similarly on the
state of assessment in attitude research in general: Typically, one posttest
is administered shortly after the treatment, with little attention to
Table 52

Social Distance Scale Assessments

                                        N

Technique                       Actual   Expectedᵃ     Mean       SD

Information                       10        18          .14       .49
Direct Contact                    11         8          .11       .30
Vicarious Experience               7         5         (.22)ᵇ    (.24)ᵇ
Persuasive Message                 1         2         (.94)ᵇ    (.00)ᵇ
Persuasive Message, Contrast       0         1           --        --
Information Plus Contact           5         9         (.31)ᵇ    (.77)ᵇ
Information Plus Vicarious        16         5          .07       .15
Systematic Desensitization         1         2         (.11)ᵇ    (.00)ᵇ
Positive Reinforcement             0         0           --        --
Other                              6         6         (.26)ᵇ    (.68)ᵇ
Total                             57        56          .16       .41

ᵃBased on marginal frequencies rounded to whole numbers, so total
does not equal 57.

ᵇBased on too few Ds to be interpretable.
assessing behavioral change. This is contrary to Rokeach's (1973) own
research in which, for example, he analyzed responses to an NAACP membership
solicitation from three to five months after an experimental treatment, and
then a year later, to determine behavioral attitude effects (also see
Ball-Rokeach, Rokeach, & Grube, 1984).

Length  of  Treatment 

Among  the  prior  reviewers  of  the  research  on  the  modification  of 
attitudes  toward  persons  with  disabilities,  Anthony  (1972)  commented  that 
length  of  treatment  might  be  an  important  study  characteristic.  Information 
on  the  total  number  of  hours  of  treatment  was  available  for  545  (84%)  of  the 
effect  sizes  in  our  data  set. 

Length  of  treatment  varied  considerably,  from  .10  hour  to  over  1,000 
hours.  The  mean  number  of  hours  of  treatment  was  37.14,  with  a  standard 
deviation of 127.95. But particularly revealing are the median number of
treatment hours, 4.00, and the mode of .7 hours, about the length of a
typical class period.

For  the  545  effect  sizes  for  which  the  number  of  hours  of  experimental 
treatment  was  available,  there  was  essentially  no  relationship  between  length 
of  treatment  and  outcomes  (r  =  .02).  But  as  can  be  seen  in  Table  53,  that 
overall  correlation  obscures  an  apparent  interaction  between  type  of 
technique  and  length  of  treatment.  For  Information  and  Persuasive  Message, 
there  were  moderate  negative  associations  (information  becomes  boring?). 
The  coefficient  of  .60  for  Systematic  Desensitization  is  particularly 
intriguing.  It  makes  sense  that  the  effects  of  desensitization  would 
increase  with  length  of  treatment — to  a  certain  point.  There  were  no  length 
of  treatment  outliers  for  that  technique,    and  the  number  of  hours  of 


Table 53

Correlations Between Length of Experimental Treatment
(Total Hours) and Magnitude of D

                                 All Available Data       Without Outliers

Technique                         N      r      r²         N      r      r²

Information                      179   -.21    .05        171   -.04    --ᵃ
Direct Contact                    64   -.04    --ᵃ         57    .09    .01
Vicarious Experience              58    .06    --ᵃ        No extreme outliers
Persuasive Message                23   -.28    .08         19   -.08    .01
Persuasive Message, Contrast      11ᵇ   .40    .16        No outliers
Information Plus Contact          71   -.03    --ᵃ         70   -.20    .04
Information Plus Vicarious        52   -.04    --ᵃ         51    .05    --ᵃ
Other                             64    .11    .01         63    .29    .09
Systematic Desensitization        21    .60    .36        No outliers
Total                            545    .02    --ᵃ        431    .01    --ᵃ

ᵃr² less than .005.
ᵇ10 Ds came from the same study.
treatment were clustered at 5 hours or less. For the other techniques, no
relationship was evident.

Visual inspection of scatter-diagrams did not reveal any curvilinear
relationships, but it did indicate that a few outliers might be exerting an
inordinate effect on some coefficients. Consequently, correlations were
rerun with outliers excluded. These coefficients are also in Table 53. The
basic effect was to dampen the relationships. For Information, the
coefficient dropped from -.21 to -.04; for Persuasive Message, the
coefficient dropped from -.28 to -.08. However, in two cases, the
association increased: from -.03 to -.20 for Information Plus Contact, and
from .11 to .29 for Other. Although the relationship of length of treatment
to outcomes is generally small, the effect did not appear to be uniform
across attitude modification techniques, and it is not a variable to be
totally ignored.
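
How a single extreme treatment length can dominate a correlation coefficient can be illustrated with a small sketch. The data below are hypothetical and are NOT taken from the report, and the 50-hour cutoff is an arbitrary assumption, since the report does not state the rule used to identify outliers.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, not from the report: treatment hours and effect sizes.
hours = [1, 2, 3, 4, 5, 100]
ds = [0.2, 0.4, 0.3, 0.5, 0.4, -0.5]

MAX_HOURS = 50  # hypothetical cutoff; the report's outlier rule is not given

kept = [(h, d) for h, d in zip(hours, ds) if h <= MAX_HOURS]
r_all = pearson_r(hours, ds)
r_trimmed = pearson_r([h for h, _ in kept], [d for _, d in kept])

print(f"r, all data: {r_all:.2f}")
print(f"r, outlier excluded: {r_trimmed:.2f}")
```

In this made-up example the one 100-hour point turns a positive association among the short treatments into a strongly negative overall coefficient, which is the kind of distortion that rerunning the correlations without outliers is meant to expose.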

Context 

The coding instrument contained a category for Treatment Context, the
general milieu or environment within which a study was conducted. As can be
seen in Table 54, the effect sizes came largely from two contexts,
College-University and Elementary-Secondary Schooling. Those two contexts
account for 85 percent (N = 549) of the effect sizes.

The highest percentage of effect sizes (49%; N = 314) came from studies
carried out in a college or university environment; the second most frequent
environment (36%; N = 235) was elementary and secondary schooling. The large
number of studies in both categories, but particularly Elementary-Secondary
Schooling, may well be primarily a function of P.L. 94-142, passed in 1975.
There  was  a  dramatic  810  percent  increase  in  effect  sizes  from  Elementary- 


Table 54

Treatment Contexts

                                     Effect Sizes

Context                            N       Mean       SD

Elementary-Secondary Schooling    235       .38       .60
College-University                314       .40       .65
Inservice                          54       .24       .42
Adult Education                     3      (.05)     (.40)
Work                                9      (.12)     (.67)
Community                           8      (.50)     (.38)
Recreation                          7      (.33)     (.30)
Other                              14       .49       .65
Total                             644       .37       .61

Note.  For mean Ds and standard deviations in parentheses, the number
of effect sizes is less than 10 and too few to interpret.

Eta² = .01.
Secondary Education context studies from 1971-75 (N = 10) to 1976-80 (N =
91), with another 40 percent increase to the 1981-85 period (N = 127). The
increase in the effect sizes coming from College-University context studies
was not as dramatic from 1971-75 (N = 63) to 1976-80 (N = 143), a 127 percent
change. And, there was a surprising 63 percent drop from 1976-80 (N = 143)
to 1981-85 (N = 53). Our data yield no insights into the reason for that
drop. Speculatively, it may be in part due to a shift in researchers'
interests, with greater availability of funds for studies at the
elementary-secondary level where the thrust for "education for all
handicapped children" has been greatest.
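
The percentage changes quoted above follow from the effect-size counts given in the text. The short sketch below recomputes them; the dictionary layout and the function name are illustrative only.

```python
# Effect-size counts by five-year period, as reported in the text.
COUNTS = {
    "Elementary-Secondary": {"1971-75": 10, "1976-80": 91, "1981-85": 127},
    "College-University": {"1971-75": 63, "1976-80": 143, "1981-85": 53},
}


def percent_change(old, new):
    """Change from old to new, expressed as a percentage of old."""
    return (new - old) / old * 100


for context, by_period in COUNTS.items():
    periods = list(by_period)
    # Compare each period with the one that follows it.
    for earlier, later in zip(periods, periods[1:]):
        change = percent_change(by_period[earlier], by_period[later])
        print(f"{context}, {earlier} to {later}: {change:+.0f}%")
```

Rounded to whole percentages, the counts reproduce the 810 and 40 percent increases for the Elementary-Secondary context and the 127 percent increase and 63 percent drop for the College-University context.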

There were 10 or more effect sizes for only four treatment contexts: the
two noted above and "Inservice" education or training, with N = 54 (8%) and
"Other" (N = 14; 2%). Only the means for the first two contexts are
presented in Table 55, because of the few effect sizes for the second two.
The overall means for the Elementary-Secondary Schooling and
College-University contexts (.38 and .40, respectively) are remarkably
similar; however, the Inservice mean of .24 (not in Table 55) is considerably
lower than both, although the differences are not much above the criterion
for triviality (.12). It is of some interest that the Inservice Ds come
basically from three sources: 17 are from studies with elementary school
teachers (mean D = .13), 13 are from studies with institutional employees
(mean D = .16), and 21 are from inservice with other groups (mean D = .40).
(The other three Ds are accounted for by inservice with special education
teachers [N = 1] and police [N = 2].)

Although the Elementary-Secondary Schooling and the College-University
mean Ds are almost identical, there are some interesting within-treatment
Table 55

Treatment Technique Outcomes in
Elementary-Secondary Schooling and
College-University Contexts

                                Elementary-Secondary        College-University

Technique                        N     Mean      SD          N     Mean     SD

Persuasive Message               9    (.80)ᵃ   (.78)ᵃ       12     .56     .39
Information Plus Contact        32     .57      .60         47     .55     .73
Direct Contact                  22     .69     1.13         46     .36     .53
Vicarious Experience            25     .30      .48         32     .51     .91
Other                           27     .49      .56         28     .33     .83
Systematic Desensitization      --     --       --          21     .32     .44
Information                     90     .24      .42         94     .35     .62
Information Plus Vicarious      30     .20   [illegible]    21     .28     .36
Persuasive Message, Contrast    --     --       --          11     .13     .33
Positive Reinforcement          --     --       --           2    1.74     .01
Total                          235     .38      .60        314     .40     .65

ᵃToo few effect sizes (less than 10) to be interpretable.
differences in mean Ds for the two contexts (see Table 55). Of particular
interest are the mean Ds for Direct Contact, for which the
Elementary-Secondary mean is higher (.69 versus .36, a .33 difference) and
for Vicarious Experience, for which the situation is reversed with the
College-University mean the higher one (.51 vs. .30, a .21 difference). Only
nine Persuasive Message and Systematic Desensitization effect sizes came from
studies conducted in an Elementary-Secondary Schooling context.

In short, most studies from which the 644 effect sizes came were
conducted in public school or higher education contexts, with very few
carried out in other environments, such as at places of employment or in
recreation contexts. For those two major context categories, the mean Ds
were similar, although both differed somewhat from the mean D for the
Inservice context, the only other context with 10 or more Ds, other than the
catch-all "Other" with 14. There is some indication of different effects for
Direct Contact and Vicarious Experience within the two major context
categories, suggesting that context is another source of study variability
that should not be ignored.

Setting 

Related to, but somewhat different from, context is the setting of the
research. While "context" refers to general environment, "setting" was
defined as the specific type of place where the research was conducted. It
is another potential source of variability in study characteristics.

Given the findings to this point, it will come as no surprise that 49
percent of the effect sizes (N = 314) came from studies carried out in
regular classrooms (see Table 56), with 57 percent (N = 180) in public school
classrooms and 35 percent (N = 109) in higher education classrooms. As would
Table 56

Setting of Treatments

                                  Effect Sizes (Ds)

Setting                          N       Mean       SD

Regular Classroom               314       .38       .61
Special Education Classroom      35       .21       .68
Institution                      22       .34       .77
Hospital                         28       .39       .51
Laboratory                       23       .46       .66
Individual or Small Group        31       .06       .53
Normal Life                      14       .65       .65
Home                              2      (.27)ᵃ    (.03)ᵃ
Dormitory                         4      (.14)ᵃ    (.54)ᵃ
Camp                              6      (.33)ᵃ    (.33)ᵃ
Recreation Facility               1      (.33)ᵃ    (.00)ᵃ
Other                            32       .32       .38
Combination                      71       .51       .54
Can't Tell                       61       .39       .71
Total                           644       .37       .61

Note.  Eta² = .03.

ᵃToo few effect sizes (less than 10) to be interpretable.
be expected, all of the 23 Ds from research in Laboratory settings (15 of
which were for Systematic Desensitization as a technique) came from the
College-University context, as did all but five of the Individual or Small
Group setting Ds. The low mean D (.06) for Individual or Small Group
settings is largely due to a mean D of -.42 for 10 Ds that came from the
Other category of treatment techniques.

With the predominance of Regular Classrooms as the research setting from
which the 644 effect sizes came, the numbers of other settings within the
various treatments are in general quite small. The few worth mention are
primarily those in which the presence of persons with disabilities makes the
setting particularly appropriate for either Contact or Contact plus
Information studies: Special Education Classrooms (13 Contact Ds),
Institutions (11 Contact plus Information Ds), and Hospitals (19 Contact Ds
and 9 Information plus Contact Ds). So, Regular Classrooms dominate the
settings, but other settings contribute to variability in study
characteristics both within and across types of attitude modification
techniques.

Sample  Sizes 

Data on the size of the experimental treatment group were available for
642 effect sizes (see Table 57). While the range in the size of the
treatment groups from which those effect sizes came is large, the modal group
was relatively small, N = 20, and the median not much larger, N = 28. Sample
sizes for different techniques were fairly similar, except for the small
Persuasive Message mode (6), which was, however, accompanied by a
substantial median (30), and the high mode (58) for Direct Contact studies.
The differences in ranges of sample sizes are, however, striking. Although
the minimum Ns are similar, the Persuasive Message (6-44) and Systematic
Table 57

Sizes of Experimental Treatment Groups

                                          Sample Size Statistics

Technique                        N    Mean  Median   Mode      SD    Range      rᵃ

Persuasive Message               23    25     30       6       13     6-44     -.20
Information Plus Contact         79    50     26      26       96     7-862    -.11
Direct Contact                   93    69     38      58      104     6-663    -.005
Vicarious Experience             58    35     26     20/52ᵇ    28     7-198    -.19
Other                            71    32     20      20       29     6-148     .31
Systematic Desensitization       21    26     21     20/45ᵇ    13    13-45      .02
Information                     202    43     26      20       58     8-544     .03
Information Plus Vicarious       62    73     37      31       98    13-424     .04
Persuasive Message, Contrast     11    20     19      20        8    13-45      --ᶜ
Positive Reinforcement            2    10     10      10      .00     --        --
Total                           642    47     28      20       73     5-862    -.02

Note.  Statistics rounded to whole numbers to reflect that samples are
constituted of whole persons.  Because of rounding, totals may not agree
exactly.

ᵃCorrelation coefficient for sample size and D.
ᵇBi-modal distributions.
ᶜCoefficient not reported because 10 of the 11 data points are from one study.
Desensitization (13-45) studies had much smaller maximum treatment group Ns
than the others, with the largest maximum Ns for Direct Contact and
Information Plus Contact (663 and 862, respectively). Those differences
undoubtedly reflect the predominance of laboratory studies for the first two
techniques and course and program evaluation studies for the latter two.

The correlation between Ds and experimental group Ns was practically nil
for the total data set (r = -.02). For individual treatment techniques with
sufficient numbers of Ds to compute a coefficient, the r's ranged from
essentially zero (Information [.03], Direct Contact [-.005], Information Plus
Contact [-.11], Information Plus Vicarious Experience [.04], and Systematic
Desensitization [.02]) to low and negative (Vicarious Experience [-.19],
Persuasive Messages [-.20]), or low and positive (Other [.31]). In the case
of the Other category, two outliers (high Ds and Ns) boosted the r.

Limited ranges of Ns do not appear to account for the low coefficients,
nor do differences in ranges of Ns for different techniques (Table 57) appear
to be a factor in the differing coefficients. The low negative coefficients
for Persuasive Message and Vicarious Experience suggest that the size of the
treatment groups might be a factor worth further consideration with those
treatments. Overall, however, it does not appear that differences in
treatment group Ns were systematically related to the outcomes with different
attitude modification techniques.

Summary 

Variability in study characteristics other than treatment features is
another source of variability among studies grouped for analysis in
meta-analytic integrative reviews. All such differences ought to be kept in
mind as posing potential "apples and oranges" difficulties in interpretation,
even though some do not appear to interact systematically with treatments to
affect outcomes. For the study characteristics discussed above (type of
comparison, time of posttest, type of dependent measure, context, setting,
and sample size), there appear to have been few systematically different
effects for different treatments, and those that are discernible are often
confounded with other sample attributes. Nevertheless, variability in study
characteristics should be an ever-present consideration as our results are
reviewed. Further analyses are warranted in an attempt to ferret out the
influences which variations in such study characteristics had on study
outcomes.

Sample  Characteristics 
In  prior  sections,  some  variations  in  treatment  features  within  attitude 
modification  technique  categories  and  in  other  study  characteristics  have 
been  presented.  Next,  information  will  be  presented  on  some  attributes  of 
sample  Ss  that  were  coded  because  they  might  influence  outcomes  or  be 
important  to  generalization. 

Sample  Selection 

Information on sample selection does not provide direct evidence of Ss'
characteristics. However, the methods by which subjects are obtained may
well influence the nature of the sample and, thereby, both the research
outcomes and their generalizability. The use of convenient samples in
psychological research, frequently sophomores in psychology classes, has
often been commented on. The equivalent in educational research is the
"intact" group, the conveniently available classroom of students. The
effects of using convenient, or intact, groups have not been systematically
investigated (except for the impact on statistical inference, e.g., Hopkins,
1982). Another selection technique, the solicitation of volunteers, has been
addressed extensively by Rosenthal and Rosnow (1975; Rosnow & Rosenthal,
1976). Their summary of the research evidence indicated that volunteers for
research projects do tend to differ from nonvolunteers in ways that could
have significant implications for research on modifying attitudes toward
persons with disabilities. For example, volunteers are likely to be more
intelligent, higher in need for social approval, and less authoritarian than
nonvolunteers.

As might be expected (see Table 58), the samples for the studies which
yielded our 644 effect sizes came from two major sources: volunteers (N =
220; 34%) and intact groups (N = 294; 46%), together accounting for 80
percent (N = 514) of the effect sizes. For the other 20%, random selection
of individuals (N = 31) or of groups then used as the unit of analysis (N =
31) accounted for 62 effect sizes (10%); for 33 effect sizes (5%), selection
was categorized as "Other"; and, for 35 effect sizes (5%), the method of
sample selection could not be identified.
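
The shares quoted in this paragraph can be rechecked from the selection-method counts in the Total row of Table 58. The sketch below is only a convenience for that check; the variable names are illustrative.

```python
# Effect sizes by sample selection method (N values in the Total row of Table 58).
SELECTION_COUNTS = {
    "Can't Tell": 35,
    "Random Samples": 62,
    "Volunteers": 220,
    "Intact Groups": 294,
    "Other": 33,
}

total = sum(SELECTION_COUNTS.values())

# Share of all effect sizes for each method, rounded to a whole percentage.
shares = {name: round(100 * n / total) for name, n in SELECTION_COUNTS.items()}

# Volunteers and intact groups taken together.
combined_n = SELECTION_COUNTS["Volunteers"] + SELECTION_COUNTS["Intact Groups"]
combined_share = round(100 * combined_n / total)

for name, share in shares.items():
    print(f"{name}: {SELECTION_COUNTS[name]} ({share}%)")
print(f"Volunteers plus Intact Groups: {combined_n} ({combined_share}%)")
```

The counts sum to 644, and volunteers and intact groups come to 34 and 46 percent, or 80 percent together.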

The mean Ds for the different sample selection methods are somewhat
perplexing, especially the mean D of .25 for Volunteers, as contrasted to a
mean D of .42 for Intact Groups and .53 for Random Samples. Wouldn't
volunteers be expected to respond more favorably to efforts to modify
attitudes toward a minority group? Alternatively, however, volunteers may
have come into modification programs with attitudes already so positive that
experimental effects were dampened. In that regard, as did Rosenthal and
Rosnow (1975, e.g., p. 49), we included as volunteer subjects not only those
who responded to a solicitation to participate in a research project but
Table 58

Methods of Sample Selection
by Attitude Modification Technique

                                                          Selection

                                    Can't        Random                    Intact
Technique                           Tell         Samples      Volunteers   Groups       Other        Total

Persuasive Message            Mean  --           (.35)        (.60)        [illegible]  [illegible]  .67
                              SD    --           (.11)        (.46)        (.17)        [illegible]  [illegible]
                              N     --           6            9            5            3            23

Information Plus Contact      Mean  (1.38)       .67          [illegible]  .51          [illegible]  [illegible]
                              SD    (.00)        .60          .98          .61          (.01)        [illegible]
                              N     1            11           12*          74**         2            100

Direct Contact                Mean  (.34)        --           .28          .49          --           [illegible]
                              SD    (.17)        --           .47          .82          --           .73
                              N     5            --           21*          67**         --           93

Vicarious Experience          Mean  (-.09)       (.83)        .23          [illegible]  --           [illegible]
                              SD    (.00)        (.20)        .75          .80          --           [illegible]
                              N     1            7            30**         20           --           58

Other                         Mean  (.36)        (.31)        .13          [illegible]  [illegible]  .39
                              SD    (.28)        (.36)        .53          .79          (.53)        [illegible]
                              N     8            4            39**         15*          5            71

Systematic Desensitization    Mean  --           --           .33          --           .30          .32
                              SD    --           --           .42          --           .55          .44
                              N     --           --           16           --           5            21

Information                   Mean  .43          .44          .20          .23          .47          .29
                              SD    .36          .38          .50          .56          .43          .51
                              N     10           31           47*          97           18           203

Information Plus Vicarious    Mean  .07          [illegible]  .19          .20          --           .20
  Experience                  SD    .14          (.36)        .40          .26          --           .36
                              N     10           3            33           16*          --           62

Persuasive Message, Contrast  Mean  --           --           .13          --           --           .13
                              SD    --           --           .33          --           --           .33
                              N     --           --           11           --           --           11

Positive Reinforcement        Mean  --           --           (1.74)       --           --           1.74
                              SD    --           --           (.01)        --           --           .01
                              N     --           --           2            --           --           2

Total                         Mean  .31          .53          .25          .42          .59          .37
                              SD    .34          .42          .60          .67          .60          .61
                              N     35           62           220          294          33           644

Note.  Eta² for sample selection method and D = .04.

*At least 10 fewer cases than expected, based on marginal frequencies.
**At least 10 more cases than expected, based on marginal frequencies.
those who volunteered to participate in activities with disabled persons
without knowing they were to be part of a research project. Such persons
might have had even higher initial attitudes than would be expected on the
general basis of estimated volunteers' traits.

The  effects  of  prior  attitudes  on  outcomes  could  not  be  investigated 
with  our  data  set  because  there  were  practically  no  reports  of  analyses  in 
which  pretreatment  attitudes  were  included  as  an  independent  variable,  with 
results  reported  by  levels  of  antecedent  attitudes.  The  hypotheses  that 
volunteers  have  higher  antecedent  attitudes,  with  higher  pretreatment 
attitudes  associated  with  less  change,  suggested  by  our  Selection  Method 
finding,  could  be  given  a  rough  check  by  coding  pretreatment  attitude  levels 
for  volunteers  and  nonvolunteers  in  different  studies  and  then  comparing  Ds 
for the two groups. That is among the many interesting questions that have
arisen  from  the  analyses  reported  here  that  call  for  further  use  of  our 
extensive  data  set  and  collection  of  research  reports. 

Of  more  concern  in  this  section  is  whether  sample  selection  was  related 
to  attitude  modification  technique  outcomes.  An  answer  is,  of  course,  made 
difficult  by  the  confounding  of  variables.  For  example,  as  indicated  in 
Table  58,  intact  groups  were  more  heavily  represented  than  volunteers  in 
Information  Plus  Contact,  Direct  Contact,  and  Information  effect  sizes,  with 
the Volunteer mean D lower in each case. But 47 percent of the Information
Plus Contact effect sizes, 49 percent of the Direct Contact effect sizes, and
46 percent of the Information effect sizes came from college and university
samples, which are more likely to be obtained through solicitation of
volunteers than are, for example, elementary and secondary school samples.
(While 40% of the college and university effect sizes came from volunteer
samples, only 22% of the elementary and secondary school effect sizes did;
or, conversely, while 57% [N = 125] of the effect sizes that came from
volunteer samples [N = 220] were for college or university students, only 24%
[N = 53] were for elementary or secondary school students.) The possible
effects of methods of sample selection are a relevant consideration in any
interpretation of our descriptions of our data set, even though sample
selection is confounded with other variables. The nature of those effects is
also a viable topic for further research in this attitude modification field.

Grade-Age  Levels 

The  age  of  the  Ss  was  a  sample  characteristic  that  we  presumed  might  be 
related  to  treatment  outcomes  and  would  be  of  interest  to  our  readers. 
However, in coding studies to try out the coding instrument during its
development, we found that few authors reported the age of their Ss.
Consequently, information on the schooling grade level of Ss was coded
because it is a fairly close proxy for age, at least through the
undergraduate years of college. Information from that coding is presented in
Table  59. 

As  has  been  noted  already  in  the  discussion  of  the  contexts  within  which 
the attitude modification studies were conducted, 85 percent of the effect
sizes (N = 549) came from studies carried out in elementary-secondary school
and college-university environments. Those figures are reflected in Table
59, with 39 percent (N = 254) of the Ds coming from studies conducted with Ss
from preschool through high school and another 43 percent coming from studies
with undergraduates (N = 253) and graduate students (N = 29). It is worth
noting, too, that 43 percent (N = 110) of the elementary-secondary level Ds
are in the Intermediate category (grades 4-6), with another 10 percent (N =
Table 59

Educational-Age Levels of
Experimental Treatment Ss

                              Effect Sizes (Ds)
Level                       N       Mean       SD
Preschool                   3      (.62)a    (1.06)a
Primary                    30       .48       .50
Intermediate              110       .34       .69
Middle School              18       .33       .42
Junior High                13       .41       .26
Senior High                18       .42       .56
Combination                62       .40       .46
  Subtotal                254
Undergraduate             253       .43       .69
Graduate                   29       .33       .45
Postprofessional           52       .23       .53
Adult Not in School        47       .19       .43
Can't Tell/Other            9      (.57)a     (.73)a
Total                     644       .37       .61

Note. Eta² = .02.

aToo few effect sizes (fewer than 10) to be interpretable.
62) in the Combination category, with many combinations that included
Intermediate Ss.

In terms of mean Ds, the most apparently interesting aspect of Table 59
is the difference between the mean D for the youngest Ss represented by at
least 10 effect sizes, the Primary Ss (grades K-3; mean D = .48), and the
older Ss, with mean Ds of .33 (Graduate), .23 (Postprofessional), and .19
(Adult Not in School). Of the 30 Ds for Primary Ss, 14 are from Information
study effect sizes (the only treatment technique with 10 or more Ds), with a
mean D of .32. The mean D of .48 is the result of 6 Primary Ds in the
Information Plus Contact category with a mean of .78. In short, the
apparently large mean differences are based on the influence of a small
subset of Ds. With that in mind, the mean Ds for various grade-age levels
are, overall, surprisingly similar, with the exception of the two adult
groups with the lowest means: Postprofessional (mean D = .23) and Adult Not
in School (mean D = .19). This similarity is captured by the Eta² of .02 for
Educational-Age Levels and Ds.
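
The influence of the small Information Plus Contact subset on the Primary mean can be checked with simple pooled-mean arithmetic. The sketch below uses only the rounded figures quoted in the paragraph above (30 Ds with mean .48; 14 Information Ds with mean .32; 6 Information Plus Contact Ds with mean .78), so its results are approximate. The function name is illustrative and is not part of the original analysis.

```python
# Pooled-mean arithmetic for the Primary level, based on the rounded
# means reported in the text (an approximation, not a reanalysis).

def mean_of_remainder(n_total, mean_total, subsets):
    """Return (n, mean) of the effect sizes left after removing (n, mean) subsets."""
    n_rest = n_total - sum(n for n, _ in subsets)
    sum_rest = n_total * mean_total - sum(n * m for n, m in subsets)
    return n_rest, sum_rest / n_rest

# Primary Ds with the 6 Information Plus Contact Ds set aside.
n_without_ipc, mean_without_ipc = mean_of_remainder(30, 0.48, [(6, 0.78)])

# Primary Ds that are neither Information nor Information Plus Contact.
n_other, mean_other = mean_of_remainder(30, 0.48, [(14, 0.32), (6, 0.78)])

print(n_without_ipc, mean_without_ipc)
print(n_other, mean_other)
```

With the 6 Information Plus Contact Ds set aside, the mean of the remaining 24 Primary Ds is roughly .4, which is in line with the means for the other school levels in Table 59.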

What about differences in grade-age level by treatments? Table 60
contains mean Ds for treatment techniques at each grade-age level.
Purposely, only means for which there were at least 10 effect sizes have
been included, to make more graphic the pattern, including absences, of mean
Ds. As noted in earlier sections, Information is clearly the most
investigated technique, followed by Information Plus Contact. Just as
clearly, the findings come primarily from Ss in the Intermediate grades (with
many grade Combinations including Intermediate Ss) and from Undergraduates.

A few treatment differences by grade-age level are worth noting. For
example, "adult" (Graduate, Postprofessional, and Adult Not in School) Ss
Table 60

Mean Ds for Treatment Techniques and Grade-Age
Levels with At Least 10 Effect Sizes

Grade-Age Level

Technique
Persuasive Message
Information Plus Contact
Direct Contact
Vicarious Experience
Other
Systematic Desensitization
Information
Information Plus Vicarious Experience
Persuasive Message, Contrast
Positive Reinforcement

Preschool    Primary    Intermediate    Middle School    Junior High    Senior High    Combination    Undergraduate    Graduate    Postprofessional    Adult    Other


.32 
14 


.59 
15 


.81 

16 


.25 
17 


.20 
13 


.20 

36 

.12 
13 


.20 
10 


.69 

36 

.20 

.40 

12 

44 

.46 

25 

.32 

26 

.16 

15 

.44 

.37 

12 

79 

.21 

22 

.09 
12 


.19 

27 


.22 
13 


.29 
11 


Note. The first number in each cell is the mean D, the second number is N. Only means based on at least 10 effect sizes are included.




received lower mean Ds than other Ss for Information and Information Plus
Contact, but not for Other (technique combinations other than those labeled
in the table). Also, the Direct Contact mean D for Intermediate Ss is
strikingly higher than those for Combination and Undergraduate Ss. The mean
differences of .41 and .61, respectively, are just below and above the
criterion of .50 (see Chapter 3) for high practical significance. And the
opposite result is seen for Vicarious Experience, with the Intermediate mean
D (.25) .21 lower than that (.46) for Undergraduates.

All in all, the information in Table 60 suggests the caution that is
necessary to avoid overgeneralizing the findings in the literature to
differing grade-age levels. Further analysis of our data will be undertaken
to determine if other sample or study characteristics explain the
differences. No analysis, however, can make up for the number of treatment
technique by grade-age level combinations for which adequate numbers of Ds are
lacking.

Gender 

Prior reviewers (e.g., Horne, 1985, pp. 132, 143) have indicated that
females tend to have more positive attitudes toward persons with
disabilities, and that they may be more likely to change attitudes in a
positive direction. In order to determine if gender was an important factor
in study outcomes, we recorded the percentage of males in the experimental
group whenever information on gender composition was available in the
research report.

Percentages of males in the experimental groups are presented in Table
61 for the 339 effect sizes for which that information was available in the
reports. Although the mean and median percentages for the total 339 effect
Table 61

Percentages of Males in
Experimental Groups

                                                    Statistics
Technique                        N    Mean   Median        Mode     SD            Range
Persuasive Message              14     21      16           00      23            0-55
Information Plus Contact        57     28      23           00      25            0-84
Contact                         38     28      29           00      19            0-61
Vicarious Experience            21     34      46           00      28            0-100
Other                           29     44   [illegible]     40      11            18-73
Systematic Desensitization      16     11       7         00/7*  [illegible]      0-33
Information                    115     40      46         00/50*    27            0-100
Information Plus Vicarious      37     41      52           00      29            0-100
Persuasive Message, Contrast    10     40      45           **      16            21-60
Positive Reinforcement           2     30      30           30      00
Total                          339     35      36           00      25            0-100

Note. Statistics rounded to whole percentages, so columns and totals may not
agree exactly.

*Bi-modal distribution.
**Multi-modal, with only 2 Ds for each %.
sizes are nearly identical (35% and 36%, respectively), the mode is zero.
That is, the most frequent occurrence was to have no males in the treatment
group. That mode is consistent across treatment techniques as well, with the
exception of the Information and Persuasive Messages, Contrast modes. Within
treatments, Systematic Desensitization stands out as having the most
restricted range, with 33 the maximum percentage of males in a treatment
group. Also, the Persuasive Message range is from 0% to 55%, also a
restricted range.

Correlations between percentages of males in the experimental group and
Ds for the 339 effect sizes are reported in Table 62. Overall, there was no
relationship between percentage of males and outcomes (r = .00). The
coefficients for the various treatment techniques range from moderately
negative* (r = -.47, r² = .22 for Systematic Desensitization)** to low and
positive (r = .31, r² = .09 and r = .27, r² = .07 for Contact and Persuasive
Messages, respectively), with most of the coefficients so low as to indicate
negligible relationships.
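
The r² values quoted with each correlation are simply the squared coefficients, read as the proportion of variance in Ds shared with the percentage of males. A minimal sketch of that arithmetic follows, using only r values reported in this section. Because the reported r values are themselves rounded, a recomputed square can differ from the reported r² in the last digit. The dictionary and function names are illustrative.

```python
# Correlations (r) between percentage of males and Ds, as reported in the text.
reported_r = {
    "Systematic Desensitization": -0.47,
    "Contact": 0.31,
    "Persuasive Message": 0.27,
    "Total": 0.00,
}

def variance_explained(r):
    """Squared correlation: proportion of variance in Ds shared with % males."""
    return r * r

for technique, r in reported_r.items():
    print(f"{technique}: r = {r:+.2f}, r^2 = {variance_explained(r):.4f}")
```

Even the largest coefficient here accounts for less than a quarter of the variance in outcomes, and the positive coefficients account for a tenth or less.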

The picture is anything but clear, with the overall lack of
relationship, some slight, positive relationships for individual techniques,
and the negative relationship for Systematic Desensitization, with its
restricted range. Surely, gender is a factor to be considered by those doing
further research on, or attempting to apply, that technique. On the other hand,
*Recall that % of males was recorded, so a negative relationship indicates
that with more males in the experimental group, Ds tended to be lower;
conversely, a positive relationship indicates that with more males in the
experimental group, Ds tended to be higher.

**The r = -.73 for Persuasive Messages, Contrast is being ignored in this
discussion because it is based on only 10 effect sizes, all from the same
study.
Table 62

Correlations Between Percentage of
Males in Experimental Groups and Ds

                                       Statistics
Technique                        N        r        r²
Persuasive Message              14       .27      .07
Information Plus Contact        57      -.03      .00
Contact                         38       .31      .10
Vicarious Experience            21       .17      .03
Other                           29       .10      .01
Systematic Desensitization      16      -.47      .22
Information                    115      -.03      .00
Information Plus Vicarious      37       .03      .00
Persuasive Message, Contrast    10      -.74a     .55
Positive Reinforcement
Total                          339       .00      .00

aBecause all of the data came from one study with 10 effect sizes, this
coefficient is ignored in the discussion of this table.
the positive, even if low, coefficients for Persuasive Message and Contact
suggest that there are situations in which males may be more responsive to
attitude change efforts, even though as a general matter gender seemed to
bear little relationship to outcome.

In order to investigate gender effects, we coded two other types of
information. First, when a complex analysis of variance was reported with
gender and treatment as factors, we recorded, as an effect size, the Eta² for
the interaction if it, or information to compute it, was available. Second,
where Ds could be computed separately for males and females within a
treatment by control, treatment by placebo, single-group, pre-posttest, or
treatment A versus B comparison, we did so in order to analyze those Ds for
differential treatment effects.

The mean Eta² for the 36 available treatment by gender interactions
was  .02,  with  a  standard  deviation  of  .05.  Clearly,  not  much  variance  was 
explained  by  that  interaction,  which  is  consistent  with  the  low  overall  r 
reported  above  and  with  the  lack  of  striking  differences  in  coefficients  by 
treatments.  There  were  too  few  treatment  by  gender  interactions  within  any 
one  treatment  for  interpretation  (the  largest  Ns  were  for  Information,  N  =  8, 
and  Information  Plus  Vicarious  Experience,  N  =  7). 

It was possible to compute separate Ds for males and females for 24
comparisons. The mean D for females was .41 and for males, .33 (the standard
deviation was .49 for each). The mean difference of .08 is too small to be
considered anything but trivial, again consistent with the overall r of .00
for percentages of males and Ds. Ten of the 24 Ds for males and females were
for the Information treatment technique. The means for females and males
were .38 and .31, respectively, again a trivial difference (.07), and
consistent with the Information r for percentages of males and Ds of -.03.
For no other treatment technique was the N for a gender with treatment
analysis greater than 6.

In short, our analyses indicated little consistent evidence that gender
is related to attitude change in our data set. Contradictory correlations
were obtained for Persuasive Messages and Contact, on the one hand (higher Ds
as % of males increased), and Systematic Desensitization (lower Ds as % of
males increased), on the other. Analyses of interactions between gender and
type of attitude modification technique have not been reported frequently in
the literature and deserve more attention in research and application
efforts.

Prior  Contact 

The  extent  and  type  of  prior  contact  that  Ss  have  had  with  persons  with 
disabilities  is  another  variable  with  potential  power  for  mediating  the 
effects  of  treatments.  Although  that  factor  was  ignored  by  prior  reviewers 
(see  Chapter  2),  we  coded  prior  contact  information  for  our  integrative 
review. Unfortunately, however, the data did not yield much information
about the relations of Ss' prior contacts to treatment outcomes.

The basic reason for the lack of information is that assessment of prior
contact was reported for only 260 out of 644 effect sizes (40%). For another
29 effect sizes (4%), prior contact was implicit (e.g., inservice education
with experienced psychiatric nurses or special education teachers). For 46
(18%) of the 289 effect sizes for which prior contact was assessed or
implicit, no mention of use of the information was made in the report. In
the reports for 125 effect sizes (45%), the prior contact information was
used only to describe the sample. In studies from which 22 Ds came, prior
contact was used to exclude Ss (i.e., to ensure that no Ss had prior
contact); for 45 Ds, prior contact was used as a criterion for inclusion in
the sample; and, for 14 Ds, Ss were selected from strata of prior contact.

Prior contact was used as a covariate in the analyses for 27 effect
sizes; however, the relationship between prior attitudes and outcomes was
hardly addressed. Of interest, for example, would be the relationship
between prior contact and attitude change in the treatment group or, better
yet, a comparison of that relationship for treatment and control groups. For
only 4 effect sizes was there also a report of the correlation between prior
contact and posttest attitude scores (N = 2) or prior contact and attitude
change scores (N = 2). And, for only 4 effect sizes was there a report of
such correlations for a treatment and control group.

Of course, comparing treatment-control group correlations is one way of
determining if there was a prior contact-by-treatment interaction effect.
Alternatively, prior contact could be used with treatment as factors in a
complex analysis of variance to determine prior contact-treatment
interactions. Such interactions were coded as secondary effect sizes on our
coding instrument. Again, the information was too sparse to be of use, with
only one effect size (Eta² = .02).

In short, although prior contact is a potentially important variable, it
received so little attention in our population of research reports that
nothing can be said about the extent to which it might have mediated
treatment effects.

Personality 

The characteristics of nondisabled Ss were mentioned as an important
theoretical factor in the discussion, earlier in this chapter, of Contact as
an attitude modification technique. Personality attributes, such as
authoritarianism, might well be important factors in the effects of attitude
change efforts. Consequently, we coded analyses of interactions between
levels of personality and treatment as secondary effect sizes. As with the
Contact studies, in which relationships between personality variables and
attitude change were not analyzed, only 3 effect sizes for personality by
treatment interactions were identified for the total data set. Although the
mean Eta² was .01, there were too few effect sizes to provide any information
worth interpreting.

Summary 

In this chapter, we have presented the major results from the analyses
of the data from our population of research reports. The intent has not been
to arrive at a statistical estimate of an overall or individual treatment
effect. The purpose has been to describe the major dimensions of the body of
literature on modifying attitudes toward persons with disabilities to
determine if there are indications of effectiveness and with what variables
treatment effectiveness might vary.

A summary of the conclusions of research report authors in regard to the
effectiveness of the treatments they had investigated indicated that it would
be unlikely that clear-cut treatment effects would emerge from analyses of
the data set. That expectation was borne out when the mean Ds (pooling
treatment by control, treatment by placebo, and single-group, pre-posttest
comparisons) for each of several attitude modification techniques were
compared. The mean Ds could be arranged in rank order, with considerable
spread from the highest to the lowest. However, the differences between mean
Ds were, for the most part, moderate to small, and the heterogeneity of
outcomes, as compared to the differences between mean Ds, was so great that
no treatment could be said to have produced overall results clearly superior
to any other treatment. Moreover, research with each technique produced both
negative changes for treatment groups and negative effect sizes, raising
further questions about the uniformity of effects.

To  illuminate  that  heterogeneity  and  some  of  the  variables  that  might  be 
associated  with  it,  we  reported  analyses  of  concomitant  variables:  study 
quality,  treatment  variations,  other  study  characteristics,  and  sample 
characteristics.  None  provided  a  clear  interpretive  path  for  the 
heterogeneous  results. 

Study  quality,  as  indicated  by  treatment  validity,  internal  validity, 
and  dependent  measure  validity,  was  not  associated  with  study  outcomes  to  any 
appreciable degree. In part, the lack of association was due to lack of
variability, with few of the reports receiving high ratings on any of the
three quality indicators. It may well be that the ambiguous and conflicting
results noted above and throughout the chapter are due to the inadequacies in
the primary research studies. However, lacking a large number of high
quality studies, our data cannot be used to address that issue. It can only
be said that an analysis of outcomes from medium and low quality research has
not produced clear-cut answers to the questions about attitude modification
that provided the impetus for this integrative review.

The  analyses  indicated  considerable  variability  in  treatment  features, 
other  study  characteristics,  and  sample  characteristics.  But  the 
associations  of  the  variables  with  outcomes  were  in  general  very  low.  There 
were  indications  that  some  study  and  sample  characteristics  were 
differentially  represented  in  the  10  groups  of  treatment  techniques,  with 
some differential effects. However, the confounding of variables, with some
nested within each other as well as within treatments, made it difficult to
disentangle the nature of covariations. Bangert-Drowns' (1986) portrayal of
the general situation in summarizing psychological research provides an apt
summary of what this chapter indicates about the research literature on
modifying attitudes toward persons with disabilities:

Research outcomes vary in ways that make generalizable interpretations
difficult. Such variation comes from a number of sources. It may
reflect real population variation, the effects of different treatment
features or study settings, sampling error, selection biases of the
reviewer, publication biases, the effects of erroneous or insufficient
reporting (unreported spurious influences, computational errors,
typographical errors), differing degrees of validity and reliability in
the outcome measures, and differences in the range or intensity of the
independent variable. The task is enormous, but the power of social
scientific inquiry would greatly increase if patterns could be found
amid this outcome variation. (p. 396)

The  patterns  are  not  yet  clear  for  the  body  of  research  we  have 
reviewed.  But  the  review  in  this  chapter  has  produced  suggestions  for 
further, and better, research (discussed further in Chapter 7). The data set
will  itself  be  submitted  to  further  analyses  in  efforts  to  find  regularities. 
The  results  presented  in  the  next  chapter  suggest  alternative  approaches  to 
the  meta-analysis  of  sets  that  may  also  be  worth  pursuing  further. 
CHAPTER 6

ALTERNATIVE  AND  SUPPLEMENTARY  DATA  EXPLORATIONS 

One  mildly  frustrating  challenge  in  carrying  out  an  integrative  review 
from  a  quantitative  perspective,  with  a  rich  data  set  gathered  from  a 
heterogeneous  and  fairly  large  population  of  studies  using  a  comprehensive 
instrument, is the selection of analyses to be conducted. The matter of
selection becomes particularly pressing when there are time constraints, as
is inevitable with a funded project. The main analyses, the results of which
were presented in Chapter 5, were conducted from a meta-analytic stance
consistent with that advocated by Glass (see Chapter 3). The major focus was
on individual outcome effect sizes, rather than on studies, and the effect
sizes  were  based  on  mean  differences  as  indicators  of  treatment  effects. 
Alternative  analytic  stances  and  indicators  of  treatment  effects  are 
reflected  in  the  preliminary  results  from  alternative  and  supplementary 
analyses  that  are  reported  briefly  in  this  chapter. 

Median  Effect  Sizes 

As noted in discussing the Results section of our coding instrument
(Chapter 3), there has been criticism in the literature of the meta-analytic
strategy of using individual outcome effect sizes as the unit of analysis
when studies yield multiple effect sizes. Recognizing that concern, we
computed median Ds for studies in order to explore the effects of using study
effect sizes in our data analyses. Whether preliminary analyses would
indicate differences in results with median Ds versus individual outcome Ds
was of particular interest.
Procedures

The synopsis by Bangert-Drowns (1986) of computing a "study effect unit
of analysis" sounds procedurally straightforward:

If  a  study  uses  more  than  one  dependent  measure,  the  corresponding 
effect  sizes  are  either  combined  if  they  represent  the  same  construct 
(e.g.,  academic  achievement)  or  they  are  sent  to  separate  analyses  if 
they  represent  different  constructs  (e.g.,  one  analysis  for  academic 
achievement  data,  another  for  attitude  toward  school  data),    (p.  393) 

However, the procedure becomes complicated when, within each study, the
construct (in this case, attitudes toward persons with disabilities) is
assessed with quite different types of measures, there are intra-study
replications, and there is repeated posttesting. Encompassing all of these
variations in one median effect size per study is as likely to lead to
misinformation and incorrect conclusions as is the lack of independence of
multiple effect sizes from individual studies.

As noted above, in this review the analytic focus was on individual
outcome effect sizes. During the funded portion of the project being reported
here, adequate time was not available to isolate and treat the various median
Ds that had been computed and recorded (by type of assessment and across type
of assessment, by replication and time of posttest) in a complete alternative
analysis. Instead, one median effect size was selected for each of the
studies from which our 644 treatment by control, treatment by placebo, and
single-group, pre-posttest Ds came. Used was the overall median D recorded
for the first effect size entered for each study. In other words, the
medians to be analyzed were for the first-recorded of any intra-study
replications and for immediate, rather than follow-up, posttest data. That
choice was justified by the lack of differences in mean Ds by replication or
time of posttesting (as reported in Chapters 4 and 5). A criterion of 5 was
used as the minimum number of medians that had to be available for a mean and
standard deviation to be considered adequately stable to be interpretable (as
contrasted with 10 for the individual effect size data).
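
The general idea of a study-level unit of analysis can be sketched as follows. This is a simplified illustration, not the procedure used in the review: it takes the median of all Ds within a study, whereas the review selected the overall median recorded for the first effect size entered for each study. The records, study identifiers, and function names are hypothetical; only the minimum of 5 medians is taken from the text.

```python
from statistics import mean, median, stdev

MIN_MEDIANS = 5  # minimum number of study medians treated as interpretable

# Hypothetical records: (study id, technique, individual effect size D).
records = [
    ("s1", "Information", 0.10), ("s1", "Information", 0.30),
    ("s1", "Information", 0.50), ("s2", "Information", 0.20),
    ("s3", "Information", 0.00), ("s3", "Information", 0.40),
    ("s4", "Information", 0.60), ("s5", "Information", 0.10),
    ("s5", "Information", 0.30), ("s6", "Direct Contact", 0.50),
    ("s6", "Direct Contact", 0.90), ("s7", "Direct Contact", 0.40),
]

def study_medians(rows):
    """Collapse individual effect sizes to one median D per (technique, study)."""
    by_study = {}
    for study, technique, d in rows:
        by_study.setdefault((technique, study), []).append(d)
    return {key: median(ds) for key, ds in by_study.items()}

def summarize(medians, min_n=MIN_MEDIANS):
    """Mean and SD of study medians per technique, flagging small Ns."""
    by_technique = {}
    for (technique, _study), m in medians.items():
        by_technique.setdefault(technique, []).append(m)
    return {
        t: {
            "n": len(ms),
            "mean": mean(ms),
            "sd": stdev(ms) if len(ms) > 1 else 0.0,
            "interpretable": len(ms) >= min_n,
        }
        for t, ms in by_technique.items()
    }

summary = summarize(study_medians(records))
for technique, stats in summary.items():
    print(technique, stats)
```

Collapsing to one value per study removes the dependence among multiple Ds from the same study, at the cost of much smaller Ns per technique, which is why a lower interpretability criterion was adopted for the medians.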

Results 

The  644  effect  sizes  in  our  main  analyses  came  from  200  studies. 
Information  on  the  median  Ds  selected  for  those  200  studies  is  presented  in 
Table 63. As can be seen in that table, with median Ds (study effect sizes)
as the unit of analysis, the Ns for two more treatment techniques*
(Systematic Desensitization and Persuasive Messages, Contrast) were not
adequate for interpretation purposes.

The most interesting aspect of Table 63 is that the use of study effect
sizes (median Ds) rather than individual effect sizes had little impact on
the magnitude of the mean Ds for the treatment techniques or the rank order
of the mean Ds. The two exceptions are Systematic Desensitization, which was
dropped from the ordering because of a low N, and Other, for which the mean D
increased from .39 to .59 (with practically no change in the standard
deviation). There is little evidence in Table 63 that using effect sizes
from individual findings, rather than study effect sizes, influenced our
results appreciably. Nevertheless, further analyses of our data set using
median Ds are warranted.
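
The Eta squared reported in the note to Table 63 for the individual effect
sizes can be approximated from the summary statistics alone. The sketch below
is an illustration under stated assumptions, not the computation used for this
report: it treats Eta squared as the between-technique sum of squares divided
by the total sum of squares, and it works from the rounded Ns, means, and
standard deviations printed in the individual effect size columns of Table 63,
so the result is only approximate.

```python
def eta_squared(groups):
    """Approximate Eta squared from (n, mean, sd) summaries of each group."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / total_n
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    # Within-group sum of squares recovered from each group's SD.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    return ss_between / (ss_between + ss_within)


# (N, mean D, SD) for individual effect sizes, as printed in Table 63.
individual_ds = [
    (23, 0.67, 0.56),    # Persuasive Message
    (100, 0.51, 0.66),   # Information Plus Contact
    (93, 0.43, 0.73),    # Direct Contact
    (58, 0.40, 0.76),    # Vicarious Experience
    (71, 0.39, 0.64),    # Other
    (21, 0.32, 0.44),    # Systematic Desensitization
    (203, 0.29, 0.51),   # Information
    (62, 0.20, 0.36),    # Information Plus Vicarious
    (11, 0.13, 0.33),    # Persuasive Message, Contrast
    (2, 1.74, 0.01),     # Positive Reinforcement
]

print(round(eta_squared(individual_ds), 2))
```

A value this small indicates that treatment technique accounts for only a
small share of the variability among the individual Ds.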

Outliers 

Extreme effect sizes, or outliers, are a matter of concern in conducting
a meta-analytic type of integrative review. Outliers may have an undue


*There was not an adequate number of Ds for Positive Reinforcement to
be considered in the earlier, individual outcomes, analyses.
Table 63

Individual Finding and Median Study
Effect Sizes (Ds) for the Treatment Techniques

                               Individual Effect Sizes    Median Effect Sizes
Technique                         N    Mean      SD        N    Mean      SD

Persuasive Message               23     .67     .56        7     .73     .49
Information Plus Contact        100     .51     .66       39     .55     .49
Direct Contact                   93     .43     .73       35     .50     .76
Vicarious Experience             58     .40     .76       19     .39     .64
Other                            71     .39     .64       21     .59  .6[?]
Systematic Desensitization       21     .32     .44        2  (.46)a  (.24)a
Information                     203     .29     .51       56     .22     .42
Information Plus Vicarious       62     .20     .36       18     .23     .41
Persuasive Message, Contrast     11     .13     .33        2  (.01)a  (.13)a
Positive Reinforcement            2 (1.74)b  (.01)b        1 (1.75)a  (.00)a

Total                           644     .37     .61      200     .42     .58

Note. Eta squared for individual effect sizes = .05; for median effect
sizes = .11.

a Fewer than 5 median effect sizes, so not considered interpretable.
b Fewer than 10 effect sizes, so not considered interpretable.
effect on results, especially if clustered within one or a few analysis
categories, such as treatment technique. Such effects are a special concern
because of the suspicion that unusually large effect sizes may be due to
study characteristics (i.e., threats to internal validity) that have
invalidly magnified outcomes.

Definition 

There are no clear guidelines for defining outliers, that is, for
deciding what is an extreme score. In arriving at a definition, we first
inspected a frequency distribution for our 644 Ds to determine if there were
clear breaks at both ends of the distribution, particularly at 1, 2, and 3
standard deviations from the mean. None was evident at one standard
deviation. Moreover, that distance from the mean did not seem adequately
extreme to be considered the dividing point for outliers. At two standard
deviations, there was a clear break at the negative end of the distribution
and a perceptible break at the positive end. Moreover, in a normal
distribution, about 95 percent of the distribution would fall within those
points. The Ds that fell beyond two standard deviations below the mean
constituted .7 percent of the distribution; 4.3 percent were more than 2
standard deviations above the mean. So, despite the obvious lack of
normality, by defining "outlier" as a D more than 2 standard deviations from
the mean, 5 percent of the effect sizes (N = 33) were excluded from the
analyses. We accepted that definition.
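
The two-standard-deviation rule can be expressed compactly. The following is a
minimal sketch, assuming only a plain list of D values; the function name and
the use of the sample standard deviation are illustrative choices, not details
taken from this review.

```python
from statistics import mean, stdev


def split_outliers(ds, k=2.0):
    """Split effect sizes into retained values and low/high outliers.

    A D is treated as an outlier when it lies more than k standard
    deviations from the mean of all Ds (k = 2 matches the text).
    """
    center = mean(ds)
    spread = stdev(ds)  # sample SD; an assumption made for this sketch
    low, high = center - k * spread, center + k * spread
    kept = [d for d in ds if low <= d <= high]
    below = [d for d in ds if d < low]
    above = [d for d in ds if d > high]
    return kept, below, above
```

Because the cutoffs are computed from the full distribution, a skewed set of
Ds can lose more values from one tail than the other, as happened here (.7
percent below the mean versus 4.3 percent above it).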

Results 

As can be seen in Table 64, the effects of dropping the 33 outliers were
minor. As would be expected, because there were more positive than negative
Table 64

Results for Treatment Techniques
with "Outlier" Ds Excluded

                                        Effect Sizes (Ds)

                                      All                Outliers Excluded
Technique                         N    Mean      SD        N    Mean      SD

Persuasive Message               23     .67     .67       21     .54     .34
Information Plus Contact        100     .51     .66       94     .46     .48
Direct Contact                   93     .43     .73       87     .28     .36
Vicarious Experience             58     .40     .76       53     .36     .54
Other                            71     .39     .64       65     .30     .47
Systematic Desensitization       21     .32     .44       21     .32     .44
Information                     203     .29     .51      197     .28     .43
Information Plus Vicarious       62     .20     .36       62     .20     .36
Persuasive Message, Contrast     11     .13     .33       11     .13     .33
Positive Reinforcement            2 (1.74)a  (.01)a

Total                           644     .37     .61      611     .32     .44

a Too few effect sizes (less than 10) to be interpretable.
outliers, the overall mean D dropped, but only from .37 to .32. As should be
the case when the range is restricted, the impact on the standard deviation
was more noticeable, dropping from .61 to .44. Because the mean D for the
highest ranked technique, Persuasive Messages, decreased considerably (from
.67 to .54), the overall spread among means for the treatment groups was
reduced; but the rankings remained basically the same, except for Direct
Contact. That mean D dropped from .43 to .28, moving it from the third rank
to a tie for the sixth rank. There also was a minor shift in the order of
the Other and Systematic Desensitization means.

It is also worth noting that eliminating outlier Ds did reduce the
magnitude of differences between the means for types of comparisons (see
Table 65). Although the mean D for each comparison type was reduced,
deleting outliers had the greatest effect on the Pre-post mean. Even that
change in means (.12), however, was right at the criterion for a trivial
difference. The range for the three comparison types dropped from .13
to .05. This negligible result, like those for the treatment technique mean
Ds, suggests that considering the more extreme Ds to be a legitimate part of
the data set had only a marginal impact on the findings.

Investigation  of  the  reasons  for  outliers  could  be  revealing  in  terms  of 
methodological  and  treatment  considerations.  In  particular,  the  reduction  in 
the mean Ds for two treatment techniques, Persuasive Messages and Direct
Contact,  bears  further  attention.  The  studies  from  which  the  outlier  Ds  came 
will  be  examined  in  a  later  phase  of  this  study  to  determine  if  there  were 
especially  serious  threats  to  internal  validity  or  other  aspects  of  treatment 
implementation  that  seem  to  explain  the  extreme  outcomes. 
Table 65

Comparison Type Effect Sizes
With Outlier Ds Eliminated

                                        Effect Sizes (Ds)

                                      All                Outliers Excluded
Comparison Type                   N    Mean      SD        N    Mean      SD

Treatment versus Control        497     .36     .58      477     .32     .42
Treatment versus Placebo         49     .29     .70       45     .20     .49
Pre-post                         98     .49     .72       89     .37     .50

Total                           644     .37     .61      611     .32     .44
Treatment A vs. B Effect Sizes

As noted in Chapter 4, a basic analytic strategy adopted for this review
was to use effect sizes based on the comparison of a treatment against the
absence of treatment, in the form of a control or placebo condition or
pretest data. Consequently, effect sizes from Treatment A versus Treatment B
(A vs. B) comparisons were excluded from our main analyses. It was our hope,
however, that a large number of Treatment A versus Treatment B effect sizes
would come from studies in which different attitude modification techniques,
rather than variations of the same technique, were compared. Those
comparisons would be directly relevant to our main analyses. For example,
with the mean D for Information Plus Contact (.52) only slightly higher than
that for Direct Contact (.43) and considerably higher than that for
Information (.29), it would be of interest whether direct comparisons of
these techniques yielded a similar ordering of outcomes.

The hope for such analyses turned out to be unrealizable. Of 61 A vs. B
Ds, only 14 came from direct comparisons of treatment techniques: 12 (from 3
studies) were for Information Plus Contact versus Information and 2 (from one
study) were for Contact versus Information.

Results 

Clearly, given the small number of Ds and, even more important, the
small number of studies, not much credence can be put in summaries of those
few A vs. B results. However, rather than simply ignoring what information
is there, and to satisfy the curious reader, it is worth noting that of the
12 Information Plus Contact versus Information comparisons, 11 favored
Information Plus Contact and 1 did not.
The mean D for these 12 comparisons is .46, with a standard deviation
of .33. One study with a sample of university students in an introductory
special education course yielded 3 Ds, all positive* (median D = .44);
another, with a sample of undergraduate students in elementary and secondary
education, yielded 6 Ds, all positive (median D = .56). The third study
involved sixth, ninth, and twelfth graders, and an effect size was computed
for each grade level. The Ds for the sixth and ninth graders were positive
(.29 and .46, respectively), but the twelfth-grade D was negative (-.52).
The author of that study (Mulkey, 1980) suggested that the film used for
information (A Different Approach) may have required too much sophistication
of the sixth and ninth graders, as contrasted with gaining information
through contact and discussions with a disabled person in a wheelchair.

What might one conclude from these few studies and Ds? Not much. There
is some confirmation of the rank ordering of Information Plus Contact and
Information as attitude modification techniques. But there also is an
indication that variations in treatments can be important, as illustrated by
the use of a film apparently too subtle for younger students, but more
effective than Contact Plus Information for twelfth graders. But with the
small Ns, extreme caution must be exercised in even considering these results
as suggestive.

Mainstreaming  Studies 
At the beginning of this project, it was thought that we might locate a
number of studies of the effects of mainstreamed classrooms on the attitudes


*Ds were computed so that a positive D indicates a greater mean gain for the
Information Plus Contact group.
of nondisabled students toward persons who have disabilities. For several
reasons (the likely long-term duration of mainstreaming contact, the
likelihood that research in ongoing school settings would be less well
controlled, and because specific attributes of mainstreaming might be of
special interest, such as the pre-mainstreaming preparation of the parents of
nondisabled as well as disabled children), it was decided to code studies of
attitude change in mainstreamed classrooms separately. For this purpose,
mainstreaming was defined (Conventions, see Appendix C) as:

a systematic, sustained effort to integrate disabled students in regular
classrooms for part or all of their instruction [for one or more periods
a day], as contrasted with bringing disabled students into a regular
classroom temporarily to provide contact as part of a research project.
(pp. C-2/CI 4/ C-21/Ci 9)

Surprisingly, we found few studies that fit this definition and in which
attitudes toward persons with disabilities were assessed (as contrasted, for
example, with attitudes toward mainstreaming). Effect sizes from studies
involving other types of in-school contact (such as presence of a special
classroom in a school, with incidental contact, or with planned contact at
lunch and on the playground, through visits to the special classroom, or
through providing special assistance to the children with disabilities) were
coded under the appropriate attitude modification techniques (Contact,
sometimes in combination with Information or Vicarious Experience).
Consequently, our analysis of mainstreaming effects turned out to be a
supplementary, rather than a mainstream, effort.

Results 

Of 20 mainstreaming effect sizes (from 9 studies), 6 were comparisons of
different versions of mainstreaming (what we have termed A vs. B studies),
leaving only 14 effect sizes from 7 studies. All of the 14 effect sizes came
from Ss who were junior high age or below (6 from the intermediate grades
[grades 4-6], 1 from junior high school, 5 from grades K-6, and 2 with
samples from intermediate through junior high school grades), with too few Ds
in any of the grade-age level categories for separate analysis. The mean D
was .19, with a standard deviation of .43 and a range from -.81 to 1.05.
(The statistics for median Ds were similar, although indicating somewhat
greater homogeneity: mean = .23; standard deviation = .32; range, -.33
to .72.)

With the small numbers of effect sizes and studies, it was not feasible
to break down the data to analyze the relationships of coded variations in
mainstreaming to outcomes. In any event, such analyses would not have been
fruitful because of the lack of information in the reports for coding (Can't
Tell was the most frequent rating for Type of Instruction, Minutes Per Day,
Days Per Week, and Minutes Per Week in Mainstreamed Classroom) and the lack
of variation when information was available for coding (None was the most
frequent rating for Special Personnel Support [N = 12; Can't Tell = 2],
Special Skills Training for Disabled Students [N = 12; Can't Tell = 2], and
Special Instruction for Nondisabled Peers [N = 8; Can't Tell = 2; Information
= 4]). About the only variability that could be gleaned was that 5 of the
14 effect sizes came from planned studies, while 9 came from post hoc
studies, and the number of months the Ss had been in mainstreamed programs
ranged from 6 to 18.

In short, the studies coded in the Mainstreaming category added little
information, and there are too few data to warrant further analyses, with one
exception. The mean D (.19) for this particular contact situation
(Mainstreaming) is lower than the overall means (see Table 60) for Direct
Contact (.43) and Information Plus Contact (.52). It is also lower than the
Direct Contact and Information Plus Contact mean Ds for Intermediate Ss (.81
and .59, respectively), but about the same as the Direct Contact mean D for
Combination grade-age level Ss (.20). Whether any other variables in our
data set are associated with this variability does merit exploration as a
part of further analyses of our Contact effect sizes.

Studies  with  Effect  Size  Information  Missing 
As noted in Chapter 3, commonly in meta-analytic types of integrative
reviews, studies are discarded if effect sizes cannot be computed directly or
estimated from the inferential statistics that are reported. We coded some
information for such studies (see Appendix B for the Coding Instrument) if,
as a minimum, the statistical significance of the results could be
determined. Our interest was to determine whether such studies differed in
characteristics or appeared to produce results different from those studies
for which effect size data could be obtained.

The number of results from reports which lacked effect size information,
but provided information on statistical significance (called after this,
"Information Missing studies"), was 182. Of these, 15 were Treatment A
versus B comparisons, leaving 167 outcomes from treatment versus control,
treatment versus placebo, and single-group, pre-posttest comparisons. To
explore whether the Information Missing studies differed from those for which
effect size information was available, our total population of studies,
including those with A vs. B comparisons, was deemed pertinent. Of course,
only frequencies are available for Information Missing studies. Some
pertinent information is reported in Tables 66, 67, and 68.
Study Quality

One might suspect that "Information Missing" studies would be of lower
quality. The General Internal Validity* information in Table 66 indicates
that to be the case for our data set, although not dramatically so. A higher
percentage of effect sizes came from studies with Medium validity (34% vs.
24%) and a lower percentage came from studies with Low validity (63% vs.
75%). There was a barely perceptible difference (3% vs. 1%) at the High
validity level, with neither effect sizes nor Missing Information results
coming from many studies of that quality.

The data in Table 66 also indicate some differences in Types of
Comparisons. Information Missing results were less likely than were effect
sizes to come from treatment versus control (T vs. C) comparisons (58% vs.
71%) and somewhat more likely to come from single-group, pre-posttest
comparisons (21% vs. 14%). Interestingly, while the Pre-post mean D was
higher than the T vs. C or T vs. P means (.49, .36, and .29, respectively;
see Table 48), for the Information Missing results (see Table 67), the
Pre-post comparisons yielded a lower percentage of positive results than the
T vs. C or T vs. P comparisons (36%, 49%, and 41%, respectively) and a higher
percentage of negative results (36% for Pre-post, against 10% and 14% for T
vs. C and T vs. P [with N only 3], respectively).

There were also two notable differences in Sample Selection. Sample
selection procedures for Information Missing results were less likely to be
identified (19% Can't Tell vs. 6%). And Missing Information outcome samples
were less likely to be volunteers (24% vs. 34%).


*Treatment  Validity  and  Test  Validity  were  not  coded  for  Information  Missing 
studies. 
Table 66

Characteristics of "Information Missing" Outcome
and Effect Size Studies

                               Information Missing
                                    Outcomes           Effect Sizes
                                   N       %            N       %

General Internal Validity
  High                             2       1           21       3
  Medium                          44      24          243      34
  Low                            136      75          441      63
  Total                          182     100          705     100

Type of Comparison
  Treatment Versus Control       106      58          499      71
  Treatment Versus Placebo        22      12           49       7
  A vs. B                         15       8           60       8
  Prepost                         39      21           97      14
  Total                          182      99          705     100

Sample Selection
  Can't Tell                      34      19           43       6
  Random                          16       9           62       9
  Volunteer                       44      24          237      34
  Intact Group                    74      41          327      46
  Other                           14       8           36       5
  Total                          182      99          705     100

Note. Because of rounding, percentages do not always sum to 100.
Table 67

Direction of "Missing Information" Outcomes
by Comparison Types

                          T vs. C       T vs. P      Pre-post       Total
Direction of Results      N     %       N     %       N     %      N     %

Positive                 52    49       9    41      14    36     75    45
Negative                 11    10       3    14      14    36     28    17
Can't Tell               43    41      10    45      11    28     64    38

Total                   106   100      22   100      39   100    167   100
Report Types

Not surprisingly, Missing Information results were less likely to come
from dissertations (24%) than were effect sizes (61%), as can be seen in
Table 68. However, even given the constraints on manuscript length for
journal publications, it is surprising ("shocking" is really a more apt word)
to see that 63 percent of the Missing Information results came from journal
articles, while only 20 percent of the effect sizes did. Even journal
publication restrictions do not account for that difference, as information
such as means and standard deviations is easily reported in tables that take
relatively little space. The occasional article discarded from our
population because not even a t-ratio or F-ratio was reported along with
level of statistical significance is even more reprehensible. Blame might be
placed on the authors who submit such reports for publication or the editors
who accept them. We believe that the editors (and their reviewers), as
"gatekeepers" for the profession, bear special responsibility.

Contexts 

Differences in Contexts (Table 69) are evident, too. Lower percentages
of Missing Information outcome studies were conducted in an Elementary and
Secondary Schooling context (26% vs. 36%) or College-University context (30%
vs. 50%), but higher percentages were conducted in Inservice Education (15%
vs. 8%) and Work (14% vs. 1%) contexts. This finding is particularly
important in light of the lower mean Ds and low numbers of Ds for the adult
groups in our main data analyses, especially when broken down by grade-age
levels (see Table 60 and that part of Table 69).

Treatment  Techniques  and  Results 

It is also worth noting that there are differences in the percentages of
treatment techniques represented in the two populations, as revealed in the
Table 68

Report Types for "Information Missing" Studies
and Effect Size Outcomes

                               Information Missing
                                    Outcomes           Effect Sizes
Type                               N       %            N       %

Journals                         115      63          141      20
Dissertations                     44      24          430      61
Theses                                                 13       2
Convention Papers                  4       2           13       2
Unpublished Reports               15       8           30       4
Combinations                       4       2           78      11

Total                            182     99a          705     100

a Because of rounding, percentages did not sum to 100.
Table 69

Study Contexts and Grade-Age Levels for
"Information Missing" and Effect Size Outcomes

                                      Information Missing
                                           Outcomes        Effect Sizes
                                          N       %         N       %

Context
  Can't Tell                              2       1        --      --
  Elementary and Secondary Schooling     48      26       255      36
  College-University                     55      30       352      50
  Adult Education                        10       5         3     0.4
  Inservice Education                    28      15        57       8
  Work                                   26      14         9       1
  Community                                                 8       1
  Recreation                              6       3         7       1
  Other                                   7       4        14       2
  Total                                 182      98       705    99.4

Grade-Age Level
  Can't Tell                              3       2         6       1
  Preschool                                                 3     0.4
  Primary                                 9       5        30       4
  Intermediate                            8       4       116      16
  Middle School                          20      11        18       3
  Junior High                             4       2        17       2
  Senior High                            10       5        19       3
  Combination                            17       9        74      10
  Undergraduate                          52      29       283      40
  Graduate                                                 30       4
  Postprofessional                       21      11        53       7
  Adult Not in School                    38      21        50       7
  Other                                                     5       1
  Total                                 182      99       705    98.4

Note. Because of rounding, percentages do not always sum to 100.
last two sets of columns of Table 70. Persuasive Messages and Systematic
Desensitization are not included in the Missing Information outcomes; there
are lower percentages of Other and Information Plus Vicarious
results, but higher percentages of Information Plus Contact and Information
results represented in the Missing Information outcomes.

Also of interest in Table 70 are the percentages of Missing Information
outcomes that were positive for each treatment technique. Direct Contact has
the highest percentage (71%), with Information Plus Contact second (44%) and
Information third (37%). That is a reversal in order for Contact and
Information Plus Contact from the rankings based on mean Ds.

Other  Results 

It is evident that the population of studies for which the reports
lacked effect size computation information is in some important respects
different from the population of studies for which such information was
available. A potentially important area of difference, already touched on
immediately above, is in the nature of results. To further address that
topic, frequencies from treatment by control, treatment by placebo, and
single-group, pre-posttest comparisons from the two populations were compared
in terms of three aspects of the results from primary research reports often
deemed important: the statistical significance of findings (important in
terms of discerning any information-provision bias associated with the
presence or lack of statistical significance), the direction of findings, and
conclusions as to what the findings indicate about treatment effectiveness.

As can be seen in Table 71, the percentages of statistically significant
and nonsignificant (at the .05 level) Missing Information results and effect
Table 70

Positive Results for Missing Information Outcomes,
Totals of Missing Information Outcomes,
and Numbers of Effect Sizes by Treatment Technique

Rankings                                  Positive
on                                        Results        Total      Effect Sizes
Mean Ds  Treatment                        N     %a      N     %       N      %

   1     Persuasive Message                                           23      4
   2     Information Plus Contact        24    44      54    32      100     15
   3     Direct Contact                  15    71      21    13       93     14
   4     Vicarious Experience             8    57      14     8       58      9
   5     Other                            1    33       3     2       71     11
   6     Systematic Desensitization                                   21      3
   7     Information                     26    37      71    42      203     31
   8     Information Plus Vicarious       1     1       3     2       62     10
   9     Persuasive Message, Contrast                                 11      2
         Positive Reinforcement                         1     1        2    0.3

         Total                           75   45b     167   100      644  99.3c

a Percentages of the Total Missing Information results that were positive for
each technique.
b Not a summation of the column, but the percentage of Total Missing
Information results that were positive.
c Because of rounding, the percentages did not add up to 100.
Table 71

Statistical Significance and Direction of Difference
for "Information Missing" Outcomes and Effect Sizes

Statistical Significance

                               Information Missing
Statistical Significance            Outcomes           Effect Sizes
at .05 Level                       N       %            N       %

Not significant                  109      65          258      55
Significant                       58      35          213      45

Total Available                  167     100          471     100

Note. For 173 effect sizes (27%), information on statistical significance
was not available.

Phi coefficient = .09.

Direction of Difference

                               Information Missing
                                    Outcomes           Effect Sizes
Direction                          N       %            N       %

Negative                          28      27          150      23
Positive                          75      73          494      77

Total                            103     100          644     100

Note. For 64 Missing Information outcomes (38%), all of which were
statistically nonsignificant, the direction of difference could not be
determined.

Phi coefficient = .03.
sizes* differed slightly. While 65 percent of the Information Missing
outcomes were not significant at the .05 level, 55 percent of the effect size
results were not; and, while 35 percent of the Information Missing outcomes
were statistically significant, 45 percent of the effect size findings were.
The Phi coefficient of .09 confirms the small trend toward missing information
when statistical significance was not attained. The relationship is even
smaller (Phi = .03) for direction of difference, despite a slight tendency
for Information Missing Outcomes to reflect more negative (27% vs. 23%) and
fewer positive (73% vs. 77%) results.
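
The two Phi coefficients can be checked against the frequencies in Table 71.
The sketch below is offered as an illustration rather than as the computation
used for this report; it assumes the usual formula for a 2 x 2 table, with
rows for the outcome categories and columns for the two populations.

```python
from math import sqrt


def phi_coefficient(a, b, c, d):
    """Phi for the 2 x 2 table [[a, b], [c, d]]."""
    numerator = a * d - b * c
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator


# Rows: not significant, significant.
# Columns: Information Missing outcomes, effect sizes.
phi_significance = phi_coefficient(109, 258, 58, 213)

# Rows: negative, positive.
# Columns: Information Missing outcomes, effect sizes.
phi_direction = phi_coefficient(28, 150, 75, 494)

print(round(phi_significance, 2), round(phi_direction, 2))
```

Both values round to the coefficients reported in Table 71 (.09 and .03).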

There are some interesting, although slight, differences in the
conclusions about treatment effectiveness drawn by the authors for Missing
Information outcomes and effect sizes (see Table 72). For example, 16
percent of the Missing Information authors drew no conclusions at all, as
contrasted with 6 percent of the effect size authors. Moreover, only 33
percent of the Missing Information authors concluded that their treatment had
no effect, but 10 percent indicated their results were equivocal, as
contrasted with 40 and 6 percent for the effect size authors. Also, only 36
percent of the Information Missing authors were coded as drawing the
conclusion that their treatment had a positive effect, while 44 percent of
the effect size authors did so. That the differences are not dramatic
overall is indicated by a Cramer's V of .16.
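
Cramer's V for the 5 x 2 table of authors' conclusions can be recomputed from
the frequencies in Table 72. This is a minimal sketch under the standard
definition (the square root of chi-square divided by N times the smaller of
rows minus 1 and columns minus 1); it is not taken from the original analysis.

```python
from math import sqrt


def cramers_v(table):
    """Cramer's V for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_square = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            chi_square += (observed - expected) ** 2 / expected
    smaller_dim = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi_square / (n * smaller_dim))


# Rows: no conclusion, no effect, equivocal, positive effect, negative effect.
# Columns: Information Missing outcomes, effect sizes (Table 72).
conclusions = [
    [26, 42],
    [56, 258],
    [17, 40],
    [61, 284],
    [7, 20],
]

print(round(cramers_v(conclusions), 2))
```

The result rounds to .16, the value reported with Table 72.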


*Information on statistical significance was not available for 173 effect
sizes. Most of these were effect sizes that were not tested for statistical
significance. For example, a researcher who reported a statistically
nonsignificant F-ratio for three treatment means and a control mean would
typically not report the inferential results of pair-wise comparisons; we
did, however, compute T vs. C effect sizes.
Table 72

Authors' Conclusions About Treatment Effectiveness
for "Information Missing" Outcomes and Effect Sizes

                               Information Missing
                                    Outcomes           Effect Sizes
Conclusion                         N       %            N       %

No conclusion                     26      16           42       6
No effect                         56      33          258      40
Equivocal results                 17      10           40       6
Positive effect                   61      36          284      44
Negative effect                    7       4           20       3

Total                            167      99          644      99

Note. Because of rounding, percentages do not sum to 100.

Cramer's V = .16.
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Differences  in  the  results  of  studies  from  reports  with  and  without 
information  for  computing  effect  sizes  appear  to  be  small.  But  it  must  be 
remembered  that  there  are  differences  in  the  studie.s  in  the  two  populations/ 
including  the  treatments  that  are  represented  in  the  data  sets. 
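
The Cramer's V reported with Table 72 can be checked from the table's
frequencies. The following is a minimal sketch in plain Python, assuming the
usual definition V = sqrt(chi-square / (N * min(rows - 1, columns - 1))). The
function name cramers_v and the list layout are illustrative choices, not
part of the original analysis.

```python
import math


def cramers_v(table):
    """Cramer's V for a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected
    smaller_dim = min(len(table), len(table[0])) - 1
    return math.sqrt(chi_square / (n * smaller_dim))


# Rows: No conclusion, No effect, Equivocal, Positive, Negative (Table 72).
# Columns: Information Missing outcomes, effect sizes.
table_72 = [[26, 42], [56, 258], [17, 40], [61, 284], [7, 20]]
v = cramers_v(table_72)
print(round(v, 2))
```

Rounded to two decimals, the value from these frequencies agrees with the
.16 reported in the table note.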

Variance  Ratios 

The  focus  of  data  analysis  in  most  educational  and  psychological 
experimental research has been on the central tendencies (usually as
indicated  by  the  means)  of  the  groups  under  study.  Although  the  variability 
in  scores  is  a  key  element  in  the  statistical  tests  used  to  examine  mean 
differences,  little  attention  is  given  to  differences  in  dispersion,  per  se. 
Reviewers  of  research,  too,  have  largely  assumed  that  the  important  question 
is  whether  there  were  mean  differences  at  treatment  end,  ignoring  questions 
of  whether  variability  increased  or  decreased,  or  which  result  would  be 
desirable.  Moreover,  whether  as  a  result  of  the  tendency  of  researchers  not 
to  ask  questions  about  the  effects  of  treatments  on  variability,  or  as  a 
cause  of  that  tendency,  the  issues  involved  in  the  analysis  of  changes  in 
variability  have  not  been  addressed  adequately. 

Analyzing  Variances 

The  variance  and  standard  deviation  are  conventionally  used  as  measures 
of  variability.  Differences  between  or  among  independent  variances  are 
occasionally  tested  for  statistical  significance  (frequently  to  test  for 
homogeneity  prior  to  testing  mean  differences  for  statistical  significance), 
and  a  standard  error  is  available  for  testing  the  difference  between  two 
correlated standard deviations. But nothing equivalent to the repeated
measures analysis of variance or analysis of covariance for means is
available for standard deviations or variances in the educational and
psychological applied research literature.

The issue of an effect size for standard deviations or variances has
also not been treated in the literature, as effect sizes for means have been.
As  we  considered  possible  effect  sizes,  it  seemed  clear  that  differences 
between  pretest  and  posttest  variances  for  a  treatment  group  and  control 
group  could  not  be  treated  as  indices  of  change  to  be  subtracted  and 
standardized  as  mean  gains  are  to  obtain  Ds.  Also,  the  comparison  of  pretest 
and  posttest  variance  ratios  by  subtracting  one  from  the  other  or  by  forming 
a  new  ratio  (e.g.,  the  pre-treatment  variance  ratio  divided  by  the  posttest 
variance  ratio)  made  little  sense. 

It  was  decided,  for  exploratory  purposes,  to  disregard  pretreatment 
differences  in  variability  and  obtain  an  estimate  of  difference  in 
posttreatment  variability.  That  is,  the  posttest  variance  for  the  treatment 
group  was  divided  by  the  posttest  variance  for  the  control  or  placebo  group, 
or  the  pretest  variance  in  a  single-group  comparison,  to  obtain  a  variance 
ratio.  A  ratio  greater  than  one  indicates  greater  posttreatment  variability 
for  the  treatment  group;  a  ratio  less  than  one  indicates  the  reverse — less 
posttest  variability  for  the  treatment  group.  Of  course,  a  ratio  of  1 
indicates  equal  variability. 
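
The variance ratio described above can be sketched in a few lines of Python.
This is an illustration under stated assumptions, not the project's own
computing procedure: the function names and the example standard deviations
are hypothetical, and standard deviations are squared to obtain variances.

```python
def variance_ratio(treatment_post_sd, comparison_sd):
    """Posttest treatment variance divided by the comparison variance.

    comparison_sd is the posttest SD of the control or placebo group, or the
    pretest SD of the same group in a single-group comparison.
    """
    if treatment_post_sd < 0 or comparison_sd <= 0:
        raise ValueError("standard deviations must be positive")
    return treatment_post_sd ** 2 / comparison_sd ** 2


def describe(ratio):
    """Translate a variance ratio into the verbal reading given in the text."""
    if ratio > 1:
        return "treatment group more variable"
    if ratio < 1:
        return "treatment group less variable"
    return "equal variability"


# Hypothetical example: treatment SD = 12.0, control SD = 10.0 at posttest.
ratio = variance_ratio(12.0, 10.0)
print(round(ratio, 2), describe(ratio))
```

Because the ratio ignores pretest differences in variability, it describes
posttest dispersion only, which is the exploratory simplification adopted in
the text.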

Overall  Results 

For the 644 effect sizes used in the main analyses reported in Chapter
5, the posttest standard deviations or variances necessary to compute
variance ratios were available for 453, or 70 percent. The mean variance
ratio was 1.13. That is, overall, experimental groups were, at posttest,
slightly more variable than their untreated comparisons, although the
difference from 1 (the ratio for equal variances) is less than the criterion
of .18 for triviality of a difference between variance ratios, as defined in
Chapter 3. The standard deviation of .91 represents considerable dispersion
in variance ratios.
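
The arithmetic behind these overall results is simple enough to verify
directly. The short Python sketch below uses only figures stated in the
paragraph above; the variable names are illustrative.

```python
n_effect_sizes = 644
n_with_variances = 453
mean_variance_ratio = 1.13
triviality_criterion = 0.18  # criterion from Chapter 3, as cited in the text

# Share of effect sizes for which a variance ratio could be computed.
coverage_percent = 100 * n_with_variances / n_effect_sizes

# Distance of the mean ratio from 1, the value for equal variances.
departure_from_one = abs(mean_variance_ratio - 1.0)
is_trivial = departure_from_one < triviality_criterion

print(round(coverage_percent), round(departure_from_one, 2), is_trivial)
```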

Evaluating  Variance  Ratios 

The evaluation of changes in mean scores on measures of attitudes toward
persons with disabilities is usually straightforward: The question is
whether the treatment group has a higher posttest mean (or lower, if that
indicates more positive attitudes) than is present in the nontreatment
comparison data, especially if any change in the treatment group's mean from
pretreatment to posttreatment assessment has been in a positive direction.

With variances, evaluation of changes is not so simple. Do we want to
increase or decrease variability in attitudes? That depends, and in many
situations an answer is difficult to come by. For example, if the treatment
group's mean attitude score is initially low and there is little movement in
a positive direction, increased variability (i.e., a higher variance ratio)
might indicate more positive scores by those who already had high scores,
which is good, unless accompanied by less positive scores by those already
low. On the other hand, a decrease in variability (i.e., a smaller variance
ratio) might reflect upward movement by those at the less favorable end of
the attitude scale, which would be good, unless the reduced variance also
indicated that those with more positive pretreatment attitudes developed
less favorable attitudes during treatment.

Probably the most unambiguous situation would be one in which the mean
attitude score for the treatment group became more positive and variability
was reduced (the variance ratio became less). That is, in such a valued
situation, the central tendency of the treatment group's attitude scores, as
compared to the comparison data, would become more positive, but the
dispersion of treatment group scores would be reduced. The reduced dispersion
would, hopefully, indicate that Ss who had lower pretreatment scores had
moved closer to the mean, which had shifted "upward", while those Ss with
initially high scores had not regressed more than expected, although a
ceiling effect might have precluded higher attitude scores. In terms of our
data analyses, one would look for positive Ds accompanied by variance ratios
of less than 1.
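
The "valued" pattern just described (a positive D together with a variance
ratio below 1) can be expressed as a simple screening rule. The sketch below
is illustrative only: the function name, the labels, and the example pairs
are hypothetical, and the rule deliberately ignores the magnitude of either
index.

```python
def classify(d, variance_ratio):
    """Label a (D, variance ratio) pair relative to the valued pattern."""
    if d > 0 and variance_ratio < 1:
        return "valued: more positive mean, less dispersion"
    if d > 0:
        return "more positive mean, dispersion not reduced"
    return "mean not more positive than comparison"


# Hypothetical (D, variance ratio) pairs, for illustration only.
pairs = [(0.50, 0.80), (0.40, 1.30), (-0.10, 0.90)]
labels = [classify(d, vr) for d, vr in pairs]
for pair, label in zip(pairs, labels):
    print(pair, label)
```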

D-Variance Ratio Relationships

What were the relations between Ds and variance ratios for our 644
effect sizes from treatment versus control, treatment versus placebo, and
single-group, pre-posttest comparisons? As reported earlier, the mean D
was .37, an indicator of positive change (remembering that underlying the
mean is considerable heterogeneity in individual Ds [SD = .61], as well as
negative treatment group changes and negative Ds). However, the mean variance
ratio, as noted above, was 1.13, reflecting a slightly greater variance for
treatment groups, on the average (again, with considerable heterogeneity in
individual variance ratios [SD = .91]).

How about the results for different treatment techniques? As Table 73
shows, the mean variance ratios for three techniques* are basically 1,
indicating equal variances. Two of these are slightly below 1 (Information
Plus Contact, .91; Information Plus Vicarious Experience, .96), and one is


*Positive Reinforcement is ignored, because the N is only 2, along with
Persuasive Message, Contrast, because 10 of 11 effect sizes are from the
same study.

Table 73

Ds and Variance Ratios for Attitude Modification Techniques

                                        Effect Sizes             Variance Ratios
Rank  Technique                       N    Mean        SD      Rank     N    Mean     SD

1     Persuasive Message             23     .67  [illeg.]  [illeg.]    18    1.36    .84
2     Information Plus Contact      100     .51  [illeg.]         7    70     .91    .57
3     Direct Contact                 93     .43  [illeg.]         8    64     .86    .40
4     Vicarious Experience           58     .40  [illeg.]  [illeg.]    44    1.37   1.13
5     Other                          71     .39       .54         1    65    1.43   1.30
6     Systematic Desensitization     21     .32       .44         5    15    1.04    .41
7     Information                   203     .29       .51         4   133    1.13    .93
8     Information Plus Vicarious     62     .20       .36         6    31     .96    .52
9     Persuasive Message, Contrast   11     .13a      .33              11     .94a   .37a
      Positive Reinforcement          2  (1.74)b    (.01)b              2  (4.13)b (.11)b

      Total                         644     .37       .61             453    1.13    .91

Note. The Eta² for Ds = .05; for variance ratios, Eta² = .10.
a Ten of 11 Ds came from one study.
b Too few effect sizes (less than 10) to be interpretable.

slightly above (Systematic Desensitization, 1.04). Two mean variance ratios
are slightly greater, but still small, departures from the "no-difference"
value of 1. One of these is less than 1 (Direct Contact, .86) and one is
greater than 1 (Information, 1.13). Three variance ratios are sufficiently
larger than 1 to indicate a difference of small to moderate magnitude:
Persuasive Messages (1.36), Vicarious Experiences (1.34), and Other (1.43).

If anything, then, a slight tendency toward greater posttest variability
in treatment groups accompanies the positive mean Ds. In particular, a
pattern of higher mean Ds associated with lower variance ratios is not
evident in the table. In fact, instead of a negative relationship, there is
a low positive correlation (r = .29; r² = .08) between the first eight mean
Ds and variance ratios.
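
The correlation between the first eight mean Ds and mean variance ratios can
be approximated from the rounded entries in Table 73. The sketch below is a
plain-Python Pearson r, offered as an illustration rather than the original
computation. It assumes the table values as printed; Vicarious Experience
appears as 1.37 in the table and 1.34 in the text, and the table value is
used here.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Table 73, techniques ranked 1 through 8 by mean D (two-decimal values).
mean_ds = [0.67, 0.51, 0.43, 0.40, 0.39, 0.32, 0.29, 0.20]
mean_vrs = [1.36, 0.91, 0.86, 1.37, 1.43, 1.04, 1.13, 0.96]

r = pearson_r(mean_ds, mean_vrs)
print(round(r, 2), round(r * r, 2))
```

The result should fall close to the reported r of .29; a small discrepancy
is to be expected because the table entries are rounded.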

To further investigate D-variance ratio relationships, correlation
coefficients were computed for individual effect sizes. The coefficient for
the 453 individual Ds and the associated variance ratios, as presented in
Table 74, is -.03, considerably different from the coefficient of .29 for the
mean Ds and variance ratios. Moreover, the correlation coefficients for
individual treatments reflect a striking amount of diversity in magnitude and
direction of relationship. Three of the coefficients are zero or near zero
(Persuasive Messages, r = -.08; Direct Contact, r = -.03; Information, r
= .00), indicating relationships between Ds and variance ratios that are
basically random.* Two coefficients are positive, with one small (Vicarious
Experience, r = .24) and the other moderate (Systematic Desensitization, r


*Scatter-diagrams were inspected for curvilinearity and to determine if
outliers had undue effects on the coefficients. In no case was there
obvious curvilinearity, and outliers fit the relationship expressed by the
r.

Table 74

Correlations Between Ds and Variance Ratios

                                                                   Median,
                                                                   Outliers
                                  Individual         Median        Excluded
Technique                         N        r        N        r        N        r

Persuasive Message               18     -.08        5   (-.88)a
Information Plus Contact         70     -.40       28     -.40       27     -.39
Direct Contact                   64     -.03 [illeg.]     -.40b
Vicarious Experience             44      .24       13      .38       12      .11
Other                            65     -.38       19      .00       18     -.34
Systematic Desensitization       15      .50        1        .
Information                     133      .00 [illeg.]      .30 [illeg.]     -.23
Information Plus Vicarious       31      .33       10      .37        9   (-.04)a
Persuasive Message, Contrast     11     -.05        2        .
Positive Reinforcement            2                 1

Total                           453     -.03      136      .17      135c     .08

a Too few data points to be interpretable.
b No outliers.
c This is the N for a coefficient with one outlier deleted, not the total
for the separate outlier-excluded analyses.

= .50), indicating that as treatment outcomes increased, so did variability.
And two coefficients are negative and moderate (Information Plus Contact, r
= -.40; Other, r = -.38), indicating that as outcomes became more positive
relative to comparison groups, variability decreased relative to the same
comparison data, which is our postulated valued relationship.

Following up on previously reported exploratory analyses, it was decided
to correlate median Ds with median variance ratios to determine what, if any,
changes would occur when analyzing by study rather than individual effect
size (see Table 74). There was little or no change for three treatments
(Information Plus Contact, -.40 vs. -.40; Vicarious Experience, .24 vs. .38;
and Information Plus Vicarious Experience, .33 vs. .37). For two
treatments, the coefficients shifted in a positive direction (Other, -.38
vs. .00; and Information, .00 vs. .30). And for one treatment, the shift
was toward greater magnitude for a negative coefficient (Direct Contact, -.03
vs. -.40).

Inspection of the scatter-diagrams indicated, however, that, contrary to
what was the case with the analyses for individual effect sizes, one or two
outliers ran counter to the trend in some of the bivariate distributions
(none of which appeared to be curvilinear). To check on the possible
influences, correlations were rerun with obviously contradictory outliers
excluded. In no case was more than one outlier deleted; with Direct Contact,
none was eliminated. The coefficient for Information Plus Contact remained
amazingly constant across all three analyses (-.40, -.40, -.39), while those
for Vicarious Experience (.38 vs. .11), Other (.00 vs. -.34), and Information
(.30 vs. -.23) changed substantially, with the Other coefficient nearly
identical to the coefficient for the individual effect size analysis (-.34
vs. -.38).

All in all, the picture is not particularly clear, except for
Information Plus Contact. All analyses indicated the postulated desirable
tendency for higher Ds for that treatment to be associated with lower
variances. Direct Contact and Other (combinations of techniques) showed
strong tendencies in that direction, with Information less strongly so. At
the same time, the coefficients for Vicarious Experience were consistently
positive, even if small (r = .11), indicating a tendency for larger variances
to be associated with higher median Ds.
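
The sensitivity of the median-based coefficients to a single outlier, noted
above for Information and Other, is easy to reproduce with a small number of
data points. The following Python sketch uses entirely hypothetical median Ds
and variance ratios (they are not taken from any study in the review) to show
how one contradictory point can reverse the sign of r.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical study-level medians; the last study is the outlier.
median_ds = [0.1, 0.2, 0.3, 0.4, 0.5, 2.0]
median_vrs = [1.5, 1.4, 1.3, 1.2, 1.1, 2.5]

r_all = pearson_r(median_ds, median_vrs)
r_trimmed = pearson_r(median_ds[:-1], median_vrs[:-1])
print(round(r_all, 2), round(r_trimmed, 2))
```

With the outlier included the coefficient is positive; with it removed the
coefficient is strongly negative. With Ns as small as those in the median
analyses, conclusions about direction therefore rest on very few points.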

The differing directions and the range of magnitude of the correlations
between Ds and variance ratios present provocative questions about the
relative effects of different treatments or, as has been noted in Chapter 5,
about the differences in samples (e.g., grade-age levels) with which
different treatments have been mainly investigated. Without data in which
pretreatment differences are held constant, further analysis did not seem
worthwhile. But, at the very least, the results suggest that the effects of
treatments on variability in attitudes toward disabled persons are deserving
of greater attention by primary researchers and reviewers.

Summary 

In this chapter, we have presented a potpourri of exploratory and
supplementary analyses. We reported finding little suggestion that the
results of analyses with the 644 Ds from treatment versus control, treatment
versus placebo, and single-group, pre-posttest comparisons would have been
different had we first eliminated outliers (Ds more than 2 standard
deviations above or below the mean), or if we had used a median D for each
study, rather than analyzing Ds for individual findings. We did report some
evidence that studies for which the reports contained insufficient
information for computing Ds seemed to be in other ways as well a different
population than the studies for which that information was available. We
found little information to share from the Mainstreaming studies and the
Treatment A versus Treatment B comparisons that were coded. Finally, our
brief treatment of the ratios of posttest treatment and comparison variances
suggested that variability is a topic that is more deserving of attention by
statisticians, primary researchers, and reviewers. In any event, the main
foundation for any conclusions to be drawn from this quantitative integrative
review remains the analyses of Ds reported in Chapter 5.

CHAPTER 7

WHAT HAS BEEN LEARNED?

As noted in Chapters 1 and 2, prior reviews have not been based on
comprehensive collections of research reports or on the extensive, systematic
collection and analysis of quantitative data on study outcomes and study
characteristics. An assumption underlying this review was that the inability
of prior reviewers to draw firm conclusions about the effectiveness of
attitude modification techniques was likely due, at least in part, to the
small samples of prior studies that were reviewed and the lack of systematic
data collection and analysis. A meta-analytic type of integrative review of
the research on modifying attitudes toward persons with disabilities was
proposed and initiated with the hope of bringing order to the literature
where other reviews had not done so. As the reader of Chapters 5 and 6
knows, that hope turned out to be in vain.

Treatment  Efficacy 

Even with a population of studies based on an exhaustive search of the
literature and with quantitative integrative review techniques, clear-cut
indications were not found of the overall efficacy of techniques for
modifying attitudes toward disabled persons or of reliable differences in
efficacy between techniques. The mean D for our 644 treatment versus
control, treatment versus placebo, and single-group, pre-posttest comparisons
was .37 (for a positively skewed distribution). This is a moderate, not
particularly large, effect size. It, as well as the standard deviation
of .61, reflects the instances in our data set of negative changes by
treatment groups (12%) and negative effect sizes (23%), which indicate that
at treatment end, the comparison group had the higher mean.

We were able to rank order the mean Ds for treatment techniques from .67
for Persuasive Messages to .20 for Information Plus Vicarious Experiences.
There were, however, negative effect sizes for each treatment. Moreover, the
variances indicated considerable overlap between the distributions of Ds for
the various treatment techniques.
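
One way to see how a mean D of .37 and a standard deviation of .61 imply a
sizable share of negative effect sizes is to ask what share would fall below
zero if the Ds were normally distributed. That normality is an assumption
made here purely for illustration; the report itself describes the
distribution as positively skewed. The function name is hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mean_d = 0.37
sd_d = 0.61
observed_share_negative = 0.23  # reported in the text

# Share of Ds below zero under a normal distribution with this mean and SD.
share_negative_if_normal = normal_cdf((0.0 - mean_d) / sd_d)
print(round(share_negative_if_normal, 2), observed_share_negative)
```

The normal approximation gives about 27 percent, somewhat above the 23
percent actually observed, which is in the direction one would expect for a
positively skewed distribution.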

Concomitant Variables

A search for concomitant variables which might explain or help to make
sense out of those results was not particularly fruitful. Surprisingly,
"quality of study" indicators were not related to outcomes, a matter to which
we will return later in this chapter.

Treatment variation. There was a great deal of variation in treatments
categorized under similar labels, such as Information and Direct Contact.
For the most part, the proportion of variance in outcomes associated with
these variations was low (.07 or less), although type of experience was
associated with 20% of the variance in Ds for Vicarious Experience studies,
and type of message presentation was associated with 28% of the variance in
Persuasive Message Ds. There were some apparent differential effects.
However, nesting of treatments within the types of disabilities toward which
attitude change efforts were directed and cells that were empty, or nearly
so, precluded conclusions about interactions.
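
The "proportion of variance associated with" a categorical study
characteristic is conventionally expressed as Eta squared, the between-
category sum of squares divided by the total sum of squares. The sketch below
assumes that definition; the function name, the category labels, and the Ds
are hypothetical and are not drawn from the review's data.

```python
def eta_squared(groups):
    """Eta squared for Ds grouped by category.

    groups maps a category label to the list of Ds coded in that category.
    """
    all_values = [v for values in groups.values() for v in values]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    if ss_total == 0:
        raise ValueError("no variance in Ds; Eta squared is undefined")
    ss_between = sum(
        len(values) * (sum(values) / len(values) - grand_mean) ** 2
        for values in groups.values()
    )
    return ss_between / ss_total


# Hypothetical Ds for two types of experience within one technique.
ds_by_type = {
    "type A": [0.10, 0.30, 0.50],
    "type B": [0.40, 0.60, 0.80],
}
eta2 = eta_squared(ds_by_type)
print(round(eta2, 2))
```

With empty or nearly empty cells, as the text notes, such a one-way index
says nothing about interactions between the characteristic and other
variables.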

Other study characteristics. Studies also differed in a variety of
other ways, including the length of treatment and time of posttest, the type
of dependent measures, the contexts and settings within which they were
carried out, and sample size. These variations explained very little of the
variance in Ds, with no r² or Eta² greater than .05. In most cases, the
majority of effect sizes fell into one or two characteristic categories. For
most of the variations, any relations with Ds were consistent across
treatments. One exception was length of treatment. The overall r for length
of treatment and Ds was .02, but there were some differences in coefficients
within treatment categories, with low negative coefficients for Information,
Information Plus Contact, and Systematic Desensitization, and a moderate
positive r for Systematic Desensitization. Another exception was context.
The predominant contexts were Elementary-Secondary Schooling and
College-University, with some nesting of treatments within contexts (e.g., no
Persuasive Messages or Systematic Desensitization effect sizes from the
Elementary-Secondary context) and some different results (e.g., a higher
Direct Contact D in the Elementary-Secondary context, with a reversal for
Vicarious Experiences). Again, nesting and empty or low N cells make
difficult any conclusions about the association of treatment outcomes with
other study characteristics.

Sample characteristics. Variations in sample characteristics also
accounted for little of the variance in Ds, with no Eta² or r² larger
than .04 for method of sample selection, grade-age level, or gender. (The
relations of prior contact and personality variables to outcomes could not be
analyzed because they were basically ignored in the primary research
reports.) There were some differential effects for samples selected by
different methods, especially volunteers; but they were confounded with
context (volunteers were more likely to come from college-university
studies). There also appeared to be treatment effect size differences by
grade-age levels, but with nesting and small Ns or empty cells, these could
not be discerned with certainty. Gender, too, seemed to be related to
different results across treatments, but the relationships were moderate and
inconsistent.

Summary. As a consequence of the unevenly distributed variations, with
many cells empty or with low Ns, and the nesting of treatments, the analysis
of potential concomitant variables was not particularly productive, except
for indicating areas to be addressed in future research. Rather than drawing
conclusions about the conditions under which different attitude modification
techniques had been more or less successful, the major conclusion had to be
that there had been a great deal of variety in the conditions under which the
effectiveness of the various attitude modification techniques was
investigated, that the variations have not been systematically controlled,
and that, for that reason, they confounded efforts to draw conclusions about
treatment effectiveness.

Summation 

All  possible  data  analyses  could  not  be  conducted  within  the  time  span 
of  the  funded  project  for  which  this  report  has  been  prepared,  and  further 
analyses  will  be  conducted  for  other  reports  to  groups  of  professionals.* 
However, at this time, the status of the research field might best be
summarized  with  the  flavor  of  the  quote  from  Towner  (1984)  which  we  used  in 
Chapter  1  to  indicate  that  another  review  of  the  literature  was  warranted: 


*Papers are now scheduled for presentation at annual meetings of the Council
for Exceptional Children in Chicago on April 22, 1987, and the American
Educational Research Association in Washington, D.C., on April 24, 1987, and
at the Third Annual Canadian Congress of Rehabilitation sponsored by the
Canadian Rehabilitation Council for the Disabled in Quebec on June 2, 1987.
Papers dealing with specific factors in attitude change, such as the target
attitudes, the grade-age level of subjects, and individual modification
techniques will be developed for presentation at the meetings of other
professional organizations. For example, a presentation on implications for
social studies curricula has been proposed for the annual meeting of the
National Council for the Social Studies in Dallas in November, 1987.
Several manuscripts for journal articles, including one in which the prior
reviews are analyzed, are under preparation, as well.

     The applications [of similar techniques] yielded discouraging and
     contradictory findings. Both positive and negative attitudinal changes,
     in addition to numerous reports of [statistically] nonsignificant
     changes, resulted from interactions [of nondisabled persons] with
     disabled persons as well as from the provision of educational and
     general information. (p. 249)

The results of this review are likely to be disappointing not only for
persons seeking guidelines for attitude modification programs, but for those
interested in the implications of this body of research for the validity of
attitude modification theory. Very few of the investigators based their
treatments explicitly upon any theory. Consequently, little worthwhile
information was obtained.

Quality  of  Research 

Earlier references to the lack of systematic variations in treatment and
other study characteristics have more than incidental importance. If the
comprehensive quantitative review of literature reported in prior chapters
did not lead to firm conclusions about the efficacy of various attitude
modification techniques for changing attitudes toward persons with
disabilities, it did yield some more definite conclusions in regard to the
quality of the research in this area.

As noted in Chapter 5 and mentioned above, research quality indicators
were not related to study outcomes in our data set. That result may have
been due in large part to the few studies which were coded as being of high
quality. One coding difficulty lay with reporting, that is, the frequent
lack of sufficient information in reports to make methodological
categorizations. Whether due to a predilection on the part of authors not to
give adequate details, or to the pressures, real or imagined, to make
research reports brief, particularly for journals, important information in
regard to procedures and threats to internal validity was omitted.

Beyond that coding constraint, however, many methodological weaknesses
were obvious in the studies that made up our population. Included were, for
example, failures to verify implementation of the independent variable,
neglecting to report reliability coefficients, much less to compute them for
the scores used in the investigation, the tendency to use only Likert-type
questionnaires to assess attitudes, failures to incorporate blinded test
administration in the procedures, and reliance upon statistical significance
as an indicator of the importance of outcomes.

Perhaps most important of all, replications of studies were, for all
intents and purposes, missing from the population of reports. Moreover, the
welter of study characteristics and outcomes belies the hope of some persons
that meta-analytic reviews of the literature will be an adequate substitute
for the careful planning of replications as an element of programmatic
research.

This "finding", that is, that the bulk of the research in the field has
not been methodologically strong, may be the most important one to come out
of this integrative review. Many implications for sounder research and
research reporting are embedded in our Results chapters. Probably none is
more important than the obvious need for programmatic research, for planned
series of replications rather than helter-skelter studies depending upon the
availability of samples and situations and the researchers' (particularly
graduate students and their advisors) particular interests at that time.

A major lesson from this integrative review, then, is that meta-analysis
is not likely to be an adequate substitute for programmatic,
replication-oriented research in this field. With the multitude of
potentially relevant variables in the area of attitude modification and the
nesting of treatments within levels of those variables, it was not possible
to isolate through post hoc analysis the possible concomitant and
contaminating factors that might account for the equivocal, even
contradictory, results from the body of primary studies.

At the same time, another dilemma has been posed. It is that even with
careful attention to design, it is difficult to conduct valid studies in an
applied area such as modifying attitudes toward disabled persons.
Unfortunately, it only takes one serious threat to internal validity to
invalidate the results from an otherwise strong study. This frailty of
research designs in applied human fields makes the call for replication all
the more urgent. It will, however, be especially important that those who
carry out replications attend carefully to methodology so that the same
design weaknesses will not be pervasive, thus producing a systematic bias in
findings across studies.

Lessons  About  Integrative  Reviews 

Jackson (1978, 1980) has enunciated well the position that the
procedural steps involved in conducting an integrative review are analogous
to those for a piece of primary research. That position served as the
conceptual frame for the review of prior reviews in Chapter 2 and for the
approach to our integrative review.

As researchers with considerable prior experience in conducting primary
research studies, however, we have discerned one important difference, at
least from primary research studies that involve the typical collection of
pre- and posttest data using a group-administered test, without the use of
observational or other intensive data collection techniques to assess either
treatment implementation or outcomes. That is, doing a proper meta-analysis
is very labor intensive and, with a large number of studies, very time
consuming. The development of a coding instrument (categories, definitions
for categories, and conventions for using the categories) is as demanding as
the development of any complex instrument that multiple observers must use
reliably. (Fortunately, much time and effort were saved for us by the
availability of the expertise and prior work of Karl White and Glendon Casto
of Utah State University's Early Intervention Research Institute.) Coding
itself, as we have detailed in Chapter 3, is a time-consuming process, and
one, our experience confirmed, in which senior staff should be involved.

As we noted at the beginning of Chapter 5, however, a major difference
between a properly conducted integrative review and a typical piece of
primary research is the complexity of data analysis, if one has gathered data
with a coding instrument which validly reflects the complexity of research
contexts and situations. Despite the advice of experienced meta-analysts, we
underestimated the time that would be needed to fully analyze our data.

Of course, an integrative review with a population of studies as
extensive and complicated as ours should not be thought of as a task to be
completed within a limited period of time, defined, for example, by a funding
period. As journal articles and further reports to professional groups are
prepared, analytic suggestions made in Chapters 5 and 6 will be followed up
and other analyses conducted. Of particular interest will be more
fine-grained analyses of attitude modification techniques by such variables
as grade-age level and disability attitude targets, in an attempt to discern
any regularities which we did not detect through the analyses which underlay this
report.

We have emphasized the need not only for better designed research
studies but for a more productive research strategy, i.e., replication, in
the investigation of modifying attitudes toward persons with disabilities.
We have also alluded to the possibility that the internal validity of
attitude modification studies in this area may be intrinsically frail,
because of the difficulties involved in studying such phenomena in applied
settings. If that is the case, even with careful replication the
accumulation of findings that indicate clearly what attitude modification
techniques are most effective, or which are most effective with which types
of persons for changing attitudes toward what types of disabilities, may turn
out to be a difficult, if not impossible, goal to attain. That state of
affairs may explain the results of this review and of the more limited
reviews that preceded it.

Another possibility has to be considered as well, however. That is,
that the state of research identified in earlier limited reviews of the
literature and confirmed by our comprehensive, quantitative review is not a
function of either poor design or inherent methodological deficiencies, but a
reflection of reality. For example, Cronbach (1975) has argued that complex
interactions among variables are the natural state of affairs with
psychological phenomena. He has suggested that the complications are so great
that when researchers begin to attend to all of the potential interactions,
they enter a metaphorical "hall of mirrors." Also arguing for a
nonsimplistic view of human behavior, Perrow (1981) has contended that
social-psychological phenomena may be much less amenable to systemization,
and much more unpredictable, than most of those engaged in "social science"
research are willing to admit.

Physicist James Crutchfield and his associates (1986) have cast the
lack of predictability of some phenomena in scientific terms. Based on their
analyses of physical phenomena, they argue that "chaos" (that is, randomness
generated according to orderly principles) may be a fundamental restriction
on our ability to develop cause-and-effect conclusions in some areas of
study. That is, contrary to the frequent assumption that the prediction of
such phenomena as the weather, the flow of water in a mountain stream, and
human behavior is possible, if only sufficient amounts of information can be
gathered and processed, they contend that randomness, or chaos, may be
fundamental.

Chaos comes about according to understandable rules that are not in
themselves based on chance, and the chaotic behavior is, itself, lawful.
However, with chaotic events (such as weather changes, turbulence of a
mountain stream, and perhaps some areas of human behavior) "small
uncertainties are amplified" and behavior becomes unpredictable. A speck of
dust observed under a microscope moves continuously and erratically because
it is bombarded by surrounding water molecules caught in thermal motion. As
Crutchfield et al. put it, "the web of causal influences among the subunits
can become so tangled that the resulting pattern of behavior becomes quite
random" (p. 46).

Chaotic behavior may stem from initially simple interactions between a
few components. The individual elements are simple and rule-consistent, but
complex interactions make prediction impossible, with exponential growth in
the inability to predict reliably as the chaotic behavior continues. Human
creativity, Crutchfield and his co-authors note, may reflect and contribute
to chaos in human behavior, as small fluctuations in thinking are amplified
and molded into "macroscopic coherent mental states that are experienced as
thoughts" (p. 57).
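
The amplification of small uncertainties described above can be made concrete
with a minimal sketch. It is offered only as an illustration and is not taken
from Crutchfield et al. or from the studies reviewed here: the logistic map
(a standard textbook example of a simple deterministic rule with chaotic
behavior), the parameter r = 4, the starting value 0.2, and the initial
uncertainty of 1e-10 are all assumptions chosen for the example. Two
trajectories that begin almost identically are followed, and the gap between
them is recorded at each step.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x); return every value visited."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def divergence(x0, delta, steps, r=4.0):
    """Step-by-step absolute gap between two trajectories whose
    starting points differ by delta (the 'small uncertainty')."""
    a = logistic_trajectory(x0, steps, r)
    b = logistic_trajectory(x0 + delta, steps, r)
    return [abs(p - q) for p, q in zip(a, b)]


if __name__ == "__main__":
    # Illustrative values only: start at 0.2 with an uncertainty of 1e-10.
    gaps = divergence(x0=0.2, delta=1e-10, steps=60)
    for n in range(0, 61, 10):
        print(f"step {n:2d}: gap = {gaps[n]:.3e}")
```

Under these assumptions the rule itself contains no chance element, yet the
gap grows roughly geometrically until it is as large as the quantities being
predicted, which is the sense in which a lawful process can still defeat
long-range prediction.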

Are attitudes and attitude change chaotic phenomena? That is, are
attitudes perturbations that result from interactions among relatively few and
simple elements whose effects are magnified by the cognitive power of the
mind, so that the attitude-related behavior has, for the purposes of
scientific systemization, large random elements to it? Chaos would explain
why reviews of the literature have consistently concluded that findings are
equivocal. Improvements in research are necessary, as sketched out above,
before it will be possible to know whether the observed state of affairs in
research on modifying attitudes toward disabled persons is an artifact of
methodology or a natural state of affairs.

In any event, the prognostication need not be pessimistic. On the one
hand, with improved research methods, researchers may be able to determine
how to have the desired influence on attitudes toward persons with
disabilities. On the other hand, should it emerge that we are dealing with
chaotic phenomena, research efforts could be directed toward understanding
how to channel behavior based on negative attitudes toward disabled persons
so that such attitudes will not have destructive effects, much as (to use a
turbulent water metaphor) researchers continue to study not only the factors
that affect water movement but how to channel water to avoid its destructive
forces, even though they are unable to predict the motions of individual
molecules or droplets of turbulent water.

Channelling of water is done to reduce the effects of turbulence.
Similarly, we have seen that the channelling of attitude-related behavior,
for example by Public Law 94-142, judicial decisions, and the policies and
actions of public and private agencies, can have (as it has in race
relations) positive long-term impacts on attitudes while delimiting the
potentially restrictive and dehumanizing effects of discriminatory behavior.
Research into the modification of attitudes should be more methodologically
and strategically sound in order to improve the likelihood that it will lead
to better understanding of the phenomenon; yet how to limit the debilitating
effects of negative attitudes may be an equally, if not more, important
policy concern and productive line of inquiry.



References 

Abroms, Kippy, & Kodera, Thomas. (1979). Acceptance hierarchy of handicaps:
Validation of Kirk's statement, "Special education often begins where
medicine stops." Journal of Learning Disabilities, 12, 15-20.

Alexander, C., & Strain, P. (1978). A review of educators' attitudes toward
handicapped children and the concept of mainstreaming. Psychology in the
Schools, 15, 390-396.

Allport, Gordon W. (1954). The nature of prejudice. Garden City, NY:
Doubleday.

Aloia, G., Beaver, R., & Pettus, W. (1978). Increasing initial interactions
among integrated EMR students and their nonretarded peers in a game-playing
situation. American Journal of Mental Deficiency, 82, 573-579.

Altrocchi, J., & Eisdorfer, C. (1961). Changes in attitudes toward mental
illness. Mental Hygiene, 45, 563-570.

Amir, Yehuda. (1969). Contact hypothesis in ethnic relations.
Psychological Bulletin, 71(5), 319-342.

Anthony, W. (1969). The effects of contact on an individual's attitude
toward disabled persons. Rehabilitation Counseling Bulletin, 12, 168-170.

Anthony, W. (1972). Societal rehabilitation: Changing society's attitudes
toward the physically and mentally disabled. Rehabilitation Psychology,
19, 117-126.

Antonak, Richard F. (1980). A hierarchy of attitudes toward exceptionality.
The Journal of Special Education, 14, 231-241.

Antonak, Richard F. (1986, June). Methods to measure attitudes toward
people who are disabled. Paper presented at the Conference on Attitudes
Toward Persons with Disabilities, Hofstra University, Hofstra, New York.

Ball-Rokeach, Sandra J., Rokeach, Milton, & Grube, Joel W. (1984). The
great American values test: Influencing behavior and belief through
television. New York: The Free Press.

Bangert-Drowns, Robert L. (1986). Review of developments in meta-analytic
method. Psychological Bulletin, 99(3), 388-399.

Begab, M. (1969). The effect of differences in curricula and experiences on
social work students' attitudes and knowledge about mental retardation.
Dissertation Abstracts, 29, 4111-4112.

Bernotavicz, Freda. (1979). Changing attitudes towards the disabled through
visual presentations: What the research says. (Report to the National
Institute of Handicapped Research, Grant No. 12-P-59142). The Disability,
Attitudes and Media Project, Human Services Development Institute, Center
for Research and Advanced Study, University of Southern Maine.

Bogdan, Robert, & Biklen, Douglas. (1977). Handicapism. Social Policy,
7(5), 14-19.

Borg, W., & Gall, M. (1983). Educational Research (4th ed.). New York:
Longman.

Bowe, Frank. (1978). Handicapping America: Barriers to disabled persons.
New York: Harper & Row.

Bowe, Frank. (1980). Rehabilitating America: Toward independence for
disabled and elderly people. New York: Harper & Row.

Bracey, Gerald W. (1986). Research: Tips for readers of research. Phi
Delta Kappan, 67, 395-396.

Bracht, Glenn H., & Glass, Gene V. (1968). The external validity of
experiments. American Educational Research Journal, 5(4), 437-474.

Bradfield, R. H., Brown, J., Kaplan, P., Rickert, E., & Stannard, R. (1973).
The special child in the regular classroom. Exceptional Children, 39,
384-390.

Brennan, J., & Margolin, R. (1954). Utilizing community resources for
rehabilitation of psychiatric patients. Personnel and Guidance Journal,
32, 330-335.

Brooks, B., & Bransford, L. (1971). Modification of teachers' attitudes
toward exceptional children. Exceptional Children, 38, 259-260.

Bryant, Fred B., & Wortman, Paul M. (1985). Methodological issues in the
meta-analysis of quasi-experiments. Evaluation Studies Review Annual, 10,
629-648.

Bullock, R. J., & Svyantek, Daniel J. (1985). Analyzing meta-analysis:
Potential problems, an unsuccessful replication, and evaluation criteria.
Journal of Applied Psychology, 70(1), 108-115.

Campbell, Donald T., & Stanley, Julian C. (1963). Experimental and
quasi-experimental designs for research. Chicago: Rand McNally.

Carlberg, Conrad G., Johnson, David W., Johnson, Roger, Maruyama, Geoffrey,
Kavale, Kenneth, Kulik, Chen-Lin C., Kulik, James A., Lysakowski, Richard
S., Pflaum, Susanna W., & Walberg, Herbert J. (1984). Meta-analysis in
education: A reply to Slavin. Educational Researcher, 13(8), 16-23.

Carver, Ronald P. (1978). The case against statistical significance
testing. Harvard Educational Review, 48(3), 378-399.

Chubon, R. (1982). An analysis of research dealing with the attitudes of
professionals toward disability. Journal of Rehabilitation, 48(1), 25-29.

Cleland, C., & Chambers, W. (1959). Experimental modification of attitudes
as a function of an instructional tour. American Journal of Mental
Deficiency, 64, 124-130.


Clore, G., & Jeffrey, K. (1972). Emotional role-playing, attitude change,
and attraction toward a disabled person. Journal of Personality and Social
Psychology, 23, 105-111.

Cohen, Jacob. (1977). Statistical power analysis for the behavioral
sciences. New York: Academic Press.

Cole, F. (1971). Contact as a determinant of sighted persons' attitudes
toward the blind. Dissertation Abstracts, 31, 6892-6893.

Cook, Thomas D., & Campbell, Donald T. (1979). Quasi-experimentation:
Design & analysis issues for field settings. Chicago: Rand McNally.

Cook, Thomas D., & Leviton, Laura C. (1980). Reviewing the literature: A
comparison of traditional methods with meta-analysis. Journal of
Personality, 48, 449-472.

Cooper, Harris M. (1982). Scientific guidelines for conducting integrative
research reviews. Review of Educational Research, 52, 291-302.

Cooper, Harris M. (1984). The integrative research review: A systematic
approach. Beverly Hills: Sage.

Cooper, Harris M., & Rosenthal, Robert. (1980). Statistical versus
traditional procedures for summarizing research findings. Psychological
Bulletin, 87(3), 442-449.

Cowen, E., Underberg, R., & Verrillo, R. (1958). The development and
testing of an attitude to blindness scale. The Journal of Social
Psychology, 48, 297-304.

Cronbach, Lee J. (1975). Beyond the two disciplines of scientific
psychology. American Psychologist, 30, 116-127.

Crutchfield, James P., Farmer, J. Doyne, Packard, Norman H., & Shaw, Robert
S. (1986). Chaos. Scientific American, 255(6), 46-57.

Dahl, H., Horsman, K., & Arkell, R. (1978). Simulation of exceptionalities
for elementary school students. Psychological Reports, 42, 573-574.

Donaldson, Joy. (1974). Effects of live, video, and audio presentations by
a panel of physically disabled individuals on attitudes toward disabled
persons. Unpublished doctoral dissertation, University of Kentucky.

Donaldson, Joy. (1976). Channel variations and effects on attitudes toward
physically disabled individuals. Audio-Visual Communication Review, 24,
135-144.

Donaldson, Joy. (1980). Changing attitudes toward handicapped persons: A
review and analysis of research. Exceptional Children, 46, 504-513.

Donaldson, J., & Martinson, M. (1977). Modifying attitudes toward
physically disabled persons. Exceptional Children, 44, 337-341.


Dye, Celeste A. (1978). Effects of persuasion and autotelic inquiry methods
on attitude change. Perceptual and Motor Skills, 47, 943-949.

Dye, Celeste A. (1980). Autotelic inquiry: A learning approach to attitude
change. Educational Gerontology, 5, 239-248.

English, R. William. (1966). Assessment, modification, and stability of
attitudes toward blind persons. Unpublished master's thesis, Southern
Illinois University.

English, R. William. (1971). Correlates of stigma towards physically
disabled persons. Rehabilitation Research and Practice Review, 2(4), 1-17.

Esposito, Beverly G., & Peach, Walter J. (1983). Changing attitudes of
preschool children toward handicapped persons. Exceptional Children,
49(4), 361-363.

Esposito, Beverly G., & Reed, Thomas M., II. (1986). The effects of contact
with handicapped persons on young children's attitudes. Exceptional
Children, 53(3), 224-229.

Evans, J. (1976). Changing attitudes toward disabled persons: An
experimental study. Rehabilitation Counseling Bulletin, 19, 572-579.

Eysenck, H. J. (1978). An exercise in mega-silliness. American
Psychologist, 33, 517.

Feldman, K. A. (1971). Using the work of others: Some observations on
reviewing, integrating, and consolidating findings. Sociology of
Education, 44, 86-102.

Fenton, T. (1975). The effects of inservice training on elementary
classroom teachers' attitudes toward and knowledge about handicapped
children. Dissertation Abstracts International, 35, 5966A.

Forader, A. (1970). Modifying social attitudes toward the physically
disabled using three different modes of instruction. Dissertation
Abstracts, 30, 4360B.

Frith, Greg H., & Mitchell, Janet W. (1981). The attitudes of nonhandicapped
students toward the mildly retarded: A consideration in placement
decisions. Education and Training of the Mentally Retarded, 16, 79-83.

Gage, N. L. (1982). The future of educational research. Educational
Researcher, 11(8), 11-12.

Gallo, P. S. (1978). Meta-analysis: A mixed meta-phor. American
Psychologist, 33, 515-517.

Gay, L. (1976). Educational research: Competencies for analysis and
application. Columbus, OH: Charles E. Merrill.


Glass, Gene V. (1976). Primary, secondary, and meta-analysis of research.
Educational Researcher, 5(10), 3-8.

Glass, Gene V. (1977). Integrating findings: The meta-analysis of research.
Review of Research in Education, 5, 351-379.

Glass, Gene V. (1978). Reply to Mansfield and Bussey. Educational
Researcher, 7, 3.

Glass, Gene V. (1980, December). On criticism of our class size/student
achievement research: No points conceded. Phi Delta Kappan, 242-244.

Glass, Gene V., McGaw, B., & Smith, M. (1981). Meta-analysis in social
research. Beverly Hills, CA: Sage.

Glass, Gene V., & Smith, Mary Lee. (1978). Reply to Eysenck. American
Psychologist, 33, 517-518.

Glass, R., & Meckler, R. (1972). Preparing elementary teachers to instruct
mildly handicapped children in regular classrooms: A summer workshop.
Exceptional Children, 39, 152-156.

Goodman, H., Gottlieb, J., & Harrison, R. (1972). Social acceptance of
EMR's integrated into a non-graded elementary school. American Journal of
Mental Deficiency, 76, 412-417.

Gottlieb, Jay. (1972). Bi-cultural study of attitude change and behavior
toward retardates. Unpublished doctoral dissertation, Yeshiva University.

Granofsky, J. (1953). Modification of attitudes toward the visibly
disabled: An experimental study of the effectiveness of social contact in
producing a modification of the attitudes of non-disabled females toward
visibly disabled males. Dissertation Abstracts, 16, 1182-1183.

Green, Bert F., & Hall, Judith A. (1984). Quantitative methods for
literature reviews. Annual Review of Psychology, 35, 37-53.

Greenstein, Theodore. (1976). Behavior change through value
self-confrontation: A field experiment. Journal of Personality and Social
Psychology, 34(2), 254-262.

Haddle, H. (1974). The modification of attitudes toward disabled persons:
The case for using systematic desensitization as an attitude-change
strategy. American Foundation for the Blind Research Bulletin, No. 28.

Hafer, M., & Narcus, M. (1979). Information and attitudes toward
disability. Rehabilitation Counseling Bulletin, 23(2), 95-102.

Harasymiw, S., & Horne, M. (1975). Integration of handicapped children:
Its effect on teacher attitudes. Education, 96, 152-155.

Harth, R. (1973). Attitudes and mental retardation: Review of the
literature. Training School Bulletin, 69, 150-164.

Hays, William L. (1973). Statistics for the social sciences (2nd ed.). New
York: Holt, Rinehart and Winston.

Hedges, Larry V. (1981). Distribution theory for Glass's estimator of
effect size and related estimators. Journal of Educational Statistics,
6(2), 107-128.

Hedges, Larry V., & Olkin, Ingram. (1980). Vote-counting methods in
research synthesis. Psychological Bulletin, 88, 359-369.

Hedges, Larry V., & Olkin, Ingram. (1985). Statistical methods for
meta-analysis. New York: Academic Press.

Hedges, Larry V., & Olkin, Ingram. (1986). Meta analysis: A review and a
new view. Educational Researcher, 15(8), 14-21.

Hersh, A., Carlson, R., & Lossino, D. (1975). Normalized interaction with
families of the mentally retarded. Mental Retardation, 15, 32-33.

Hicks, J., & Spaner, F. (1962). Attitude change and mental hospital
experience. Journal of Abnormal and Social Psychology, 65, 112-120.

Holzberg, J., & Gewirtz, H. (1963). A method of altering attitudes toward
mental illness. Psychiatric Quarterly Supplement, 37, 56-61.

Hopkins, Kenneth D. (1982). The unit of analysis: Group means versus
individual observations. American Educational Research Journal, 19(1),
5-18.

Horne, M. (1979). Attitudes and mainstreaming: A literature review for
school psychologists. Psychology in the Schools, 16, 61-67.

Horne, Marcia D. (1985). Attitudes toward handicapped students:
Professional, peer, and parent reactions. Hillsdale, NJ: Lawrence
Erlbaum.

Horne, Marcia D. (1986, June). Modifying peer attitudes toward the
handicapped: Procedures and research issues. Paper presented at the
Conference on Attitudes Toward Persons With Disabilities, Hofstra
University, Hempstead, NY.

Hovland, C., Janis, I., & Kelley, H. (1953). Communication and persuasion.
New Haven: Yale University Press.

Hunter, John E., Schmidt, Frank L., & Jackson, Gregg B. (1982).
Meta-analysis: Cumulating research findings across studies. Beverly Hills:
Sage.

Hyde, J. S. (1981). How large are cognitive gender differences? A
meta-analysis using ω² and d. American Psychologist, 36, 892-901.

Insko, Chester A. (1967). Theories of attitude change. New York: Meredith
Publishing.


Jackson, Gregg B. (1978). Methods for reviewing and integrating research in
the social sciences. Final Technical Report to National Science Foundation
for Grant #DIS 76-20398. Washington, DC: The George Washington
University, Social Research Group.

Jackson, G. (1980). Methods for integrative reviews. Review of Educational
Research, 50, 438-460.

Johannsen, W. (1969). Attitudes toward mental patients: A review of
empirical research. Mental Hygiene, 53, 218-228.

Jones, Reginald L. (Ed.). (1984). Attitudes and attitude change in special
education: Theory and practice. Reston, VA: The Council for Exceptional
Children.

Kearney, N., & Roccio, P. (1956). The effect of teacher education on the
teacher's attitudes. Journal of Educational Research, 49, 703-708.

Kiesler, Charles A., Collins, Barry E., & Miller, Norman. (1969). Attitude
change: A critical analysis of theoretical approaches. New York: John
Wiley & Sons.

Kirby, F., & Toler, H. (1970). Modification of pre-school isolate behavior:
A case study. Journal of Applied Behavior Analysis, 3, 309-314.

Kleinfield, Sonny. (1979). The hidden minority: A profile of handicapped
Americans. Boston: Little, Brown, and Company.

Kuhn, J. (1971). A comparison of teachers' attitudes toward blindness and
exposure to blind children. The New Outlook for the Blind, 65, 337-340.

Kulik, James A., & Kulik, Chen-Lin C. (1986, April). Operative and
interpretable effect sizes in meta-analysis. Paper presented at the annual
meeting of the American Educational Research Association, San Francisco.

La Hue, A. (1959). Teacher classroom attitudes. Journal of Teacher
Education, 10, 433.

Ladas, H. (1980). Summarizing research: A case study. Review of
Educational Research, 50(4), 597-624.

Lane, P. (1976). Evaluative statements by prospective students as a
function of ethnic and retardation labels. Dissertation Abstracts
International, 37, 1491-A.

Lapp, Bernard. (1974). The effects of behavior modification inservice
training on teacher behavior, teacher attitudes, and knowledge of behavior
modification. Unpublished doctoral dissertation, University of
Connecticut.

Lazar, A., Gensley, J., & Orpet, R. (1971). Changing attitudes of young
mentally gifted children toward handicapped persons. Exceptional Children,
37, 600-602.


Levitt, Edith, & Cohen, Shirley. (1976). Attitudes of children toward their
handicapped peers. Childhood Education, 52, 171-173.

Lewis, I., & Cleveland, D. (1966). Nursing students' attitudinal changes
following a psychiatric affiliation. Journal of Psychiatric Nursing, 4,
223-231.

Leyser, Y., & Gottlieb, J. (1980). Improving the social status of rejected
pupils. Exceptional Children, 48, 459-461.

Light, Richard J. (1979). Capitalizing on variation: How conflicting
research findings can be helpful for policy. Educational Researcher, 8(8),
3-8.

Light, Richard J., & Pillemer, David B. (1982). Numbers and narrative:
Combining their strengths in research reviews. Harvard Educational Review,
52(1), 1-26.

Light, Richard J., & Pillemer, David B. (1984). Summing up: The science of
reviewing research. Cambridge: Harvard University Press.

Light, Richard J., & Smith, P. (1971). Accumulating evidence: Procedures
for resolving contradictions among different research studies. Harvard
Educational Review, 41(4), 429-471.

Makas, Elaine. (1985, August). The measurement of attitudes toward disabled
people: A new approach. Paper presented at the Annual Convention of the
American Psychological Association, Los Angeles.

Makas, Elaine. (1986, April). The relationship between contact with and
attitudes toward people with disabilities: A question of theory or of
method. Paper presented at the annual meeting of The Society for the Study
of Chronic Illness, Impairment, & Disability, Reno, Nevada.

Mansfield, R. S., & Bussey, T. V. (1977). Meta-analysis of research: A
rejoinder to Glass. Educational Researcher, 6, 3.

Marlowe, M. (1979). The games analysis intervention: A procedure to
increase the peer acceptance and social adjustment of a retarded child.
Education and Training of the Mentally Retarded, 14, 262-268.

Matkin, Ralph E., Hafer, Marilyn, Wright, W. Russell, & Lutzker, John R.
(1983). Pretesting artifacts: A study of attitudes toward disabilities.
Rehabilitation Counseling Bulletin, 26(5), 342-348.

McGaw, Barry, & Glass, Gene V. (1980). Choice of the metric for effect size
in meta-analysis. American Educational Research Journal, 17(3), 325-337.

Middleton, J. (1953). The prejudices and opinions of mental hospital
employees regarding mental illness. American Journal of Psychiatry, 110,
133-138.


Mitchell, Marlys M. (1976). Teacher attitudes. The High School Journal, 59,
302-312.

Mouly, G. (1978). Educational research: The art and science of
investigation. Boston: Allyn and Bacon.

Mulkey, Sidney W. (1980). Standard stimulus effects on attitudes toward
disabled persons: An experimental study. Unpublished doctoral
dissertation, The Florida State University.

Murray, J. (1958). An experiment in changing the attitudes of employers
toward mental illness. Mental Hygiene, 42, 402.

Nunnally, N. (1961). Popular conceptions of mental health. New York:
Holt, Rinehart and Winston.

Oberle, Judson B. (1975). The effect of personalization and quality of
contact on changing expressed attitudes and hiring preferences toward
disabled persons. Unpublished doctoral dissertation, Syracuse University.

Orwin, Robert G., & Cordray, David S. (1985). Effects of reporting on
meta-analysis: A conceptual framework and reanalysis. Psychological
Bulletin, 97(1), 134-147.

Osgood, G., Succi, G., & Tannebaum, P. (1957). The measurement of meaning.
Urbana: University of Illinois Press.

Ozyurek, Mehmet. (1977). Effects of live, audio, and print presentations of
a discussion about physical disabilities on attitude modification toward
disabled persons in Turkey. Unpublished doctoral dissertation, University
of Northern Colorado.

Parish, T., Eads, G., Reece, N., & Piscitello, M. (1977). Assessment and
attempted modification of future teachers' attitudes toward handicapped
children. Perceptual and Motor Skills, 44, 540-542.

Perkins-Karniski, M. (1978). The effect of increased knowledge of body
systems and functions on attitudes toward the disabled. Rehabilitation
Counseling Bulletin, 22, 16-20.

Perrow, Charles. (1981). Disintegrating social sciences. New York
University Education Quarterly, 12(2), 2-9.

Pettit, L. (1956). Attitudes of relatives of long-hospitalized mental
patients regarding convalescent leave. Mental Hygiene, 40, 251.

Preece, Peter F. W. (1983). A measure of experimental effect size based on
success rates. Educational and Psychological Measurement, 43, 763-766.

Pulton, T. (1976). Attitudes toward the physically disabled: A review and
a suggestion for producing positive attitude change. Physiotherapy Canada,
28, 83-88.


Quay, L., Bartlett, C., Wrightsman, L., & Catron, D. (1961). Attitude
change in attendant employees. The Journal of Social Psychology, 55, 27-31.

Rabkin, J. (1972). Opinions about mental illness: A review of the
literature. Psychological Bulletin, 77(3), 153-171.

Rapier,  J.,  Adelson,  R.,  Carey,  R.,  &  Croke,  K.  (1972).  Changes  in 
children's  attitudes  toward  the  physically  handicapped.  Exceptional 
Children,  39,  219-223. 

Richardson,  S.  A.,  &  Ronald,  L.  (1977).  The  effect  of  a  physically 
handicapped  interviewer  on  children's  expression  of  values  toward  handicap. 
Rehabilitation Psychology, 24(4), 211-218.

Richardson, Stephen, Goodman, Norman, Hastorf, Albert, & Dornbusch, Stanford.
(1961).  Cultural  uniformity  in  reaction  to  physical  disabilities. 
American  Sociological  Review,  26,  241-247. 

Rokeach,  Milton.  (1968).  Beliefs,  attitudes,  and  values.  San  Francisco: 
Jossey-Bass,  Inc. 

Rokeach, Milton. (1971). Long-range experimental modification of values,
attitudes, and behavior. American Psychologist, 26(5), 453-459.

Rokeach,  Milton.  (1973).  The  nature  of  human  values.  New  York:  The  Free 
Press. 

Rokeach, Milton, & McLellan, D. Daniel. (1972). Feedback of information
about the values and attitudes of self and others as determinants of
long-term cognitive and behavioral change. Journal of Applied Social
Psychology, 2(3), 236-251.

Rosenthal,  Robert.  (1978).  Combining  results  of  independent  studies. 
Psychological  Bulletin,  85(1),  185-193. 

Rosenthal, Robert. (1984). Meta-analytic procedures for social research.
Beverly  Hills,  CA:  Sage. 

Rosenthal, Robert, & Rosnow, Ralph L. (1975). The volunteer subject. New
York:    John  Wiley  and  Sons. 

Rosenthal,  Robert,  &  Rubin,  Donald  B.  (1982).  Comparing  effect  sizes  of 
independent studies. Psychological Bulletin, 92(2), 500-504.

Rosenthal, Robert, & Rubin, Donald B. (1982). A simple, general purpose
display  of  magnitude  of  experimental  effect.  Journal  of  Educational 
Psychology,  74(2),  166-169. 

Rosnow,  Ralph  L.,  &  Rosenthal,  Robert.  (1976).  The  volunteer  subject 
revisited.    Australian  Journal  of  Psychology,  28,  97-108. 


Rosswurm, M. (1980). Changing nursing students' attitudes toward persons
with physical disabilities. ARN, 5, 12-14.

Rowe, Joanne, & Stutts, Rose Marie. (In press). Impact of experience with
disabled individuals on teachers' attitudes. Adapted Physical Activity
Quarterly.

Rusalem, H. (1967). Engineering changes in public attitudes toward a
severely disabled group. Journal of Rehabilitation, 33(3), 26-27.

Sadlick, Marie, & Penta, Frank B. (1975). Changing nurse attitudes toward
quadriplegics through use of television. Rehabilitation Literature, 36,
274-278, 288.

Sandler, A., & Robinson, R. (1981). Public attitudes and community
acceptance of mentally retarded persons: A review. Education and Training
of the Mentally Retarded, 16, 97-103.

Segal, S. (1978). Attitudes toward the mentally ill: A review. Social
Work, 23, 211-217.

Sellin, D., & Mulchahay, R. (1965). The relationship of an institutional
tour upon opinions about mental retardation. American Journal of Mental
Deficiency, 70, 408-412.

Shaver,  James  P.  (1979).  The  productivity  of  educational  research  and  the 
applied-basic  research  distinction.  Educational  Researcher,  8(1),  3-9. 
(a) 

Shaver,  James  P.  (1979).  The  usefulness  of  educational  research  in 
curricular/instructional  decision-making  in  social  studies.  Theory  and 
Research in Social Education, 2(3), 21-46. (b)

Shaver,  James  P.  (1980,  April).  Readdressing  the  role  of  statistical  tests  of 
significance.  Paper  presented  at  the  annual  meeting  of  the  American 
Educational  Research  Association,  Boston. 

Shaver,  James  P.  (1983).  The  verification  of  independent  variables  in 
teaching  methods  research.    Educational  Researcher,  12(8),  2-9. 

Shaver,  James  P.  (1985).  Chance  and  nonsense:  A  conversation  about 
interpreting  tests  of  statistical  significance.  Part  1.  Phi  Delta  Kappan, 
67(Sept.),  138-141.  (a) 

Shaver,  James  P.  (1985).  Chance  and  nonsense:  A  conversation  about 
interpreting tests of statistical significance. Part 2. Phi Delta Kappan,
67(Oct.),  138-141.  (b) 

Shaver,  James  P.,  &  Curtis,  Charles  K.  (1981).  Handicapism  and  equal 
opportunity:  Teaching  about  the  disabled  in  social  studies.  Reston,  VA: 
Foundation for Exceptional Children. (a)


Shaver, James P., & Curtis, Charles K. (1981). Handicapism: Another
challenge for social studies. Social Education, 45(3), 208-211. (b)

Shaver, James P., & Norton, Richard S. (1980). Populations, samples,
randomness, and replication in two social studies journals. Theory and
Research in Social Education, 8(2), 1-10. (a)

Shaver, James P., & Norton, Richard S. (1980). Randomness and replication
in ten years of the American Educational Research Journal. Educational
Researcher, 9(1), 9-15. (b)

Shotel, J., Iano, R., & McGettigan, J. (1974). Teacher attitudes associated
with integration of handicapped children. In G. J. Warfield (Ed.),
Mainstream currents. Reston, VA: Council for Exceptional Children, 91-97.

Simpson, R., Parrish, N., & Cook, J. (1976). Modification of attitudes of
regular class children towards the handicapped for the purpose of achieving
integration. Contemporary Educational Psychology, 1, 46-51.

Simpson, S. N. (1980). Comment on "meta-analysis of research on class-size
and achievement". Educational Evaluation and Policy Analysis, 2, 81-83.

Siperstein, Gary N., Bak, John J., & Gottlieb, Jay. (1977). Effects of
group discussion on children's attitudes toward handicapped peers. The
Journal of Educational Research, 70, 131-134.

Slavin, Robert E. (1984). Meta-analysis in education: How has it been used?
Educational Researcher, 13(8), 6-15.

Slavin, Robert E. (1986). Best-evidence synthesis: An alternative to
meta-analytic and traditional reviews. Educational Researcher, 15(9), 5-11.

Smith, Mary Lee. (1980). Publication bias and meta-analysis. Evaluation in
Education, 4, 22-24.

Smith, Mary Lee, & Glass, Gene V. (1980). Meta-analysis of research on class
size and its relationship to attitudes and instruction. American
Educational Research Journal, 17(4), 419-434.

Stevens, T., & Braun, B. (1980). Measures of regular classroom teachers'
attitudes toward handicapped children. Exceptional Children, 46, 292-294.

Stock, William A., Okun, Morris A., Haring, Marilyn J., Miller, Wendy, &
Ceurvorst, Robert W. (1982). Rigor in data synthesis: A case study of
reliability in meta-analysis. Educational Researcher, 11(6), 10-14, 20.

Strain, P., & Timm, M. (1974). An experimental analysis of social
interaction between a behaviorally disordered preschool child and her
classroom peers. Journal of Applied Behavior Analysis, 7, 583-590.

Strauch, J. (1970). Social contact as a variable in the expressed attitudes
of normal adolescents toward EMR pupils. Exceptional Children, 36, 495-500.


Thompson, Wayne N. (1975). The process of persuasion: Principles and
readings. New York: Harper & Row.

Towner, A. (1984). Modifying attitudes toward the handicapped: A review of
the literature and methodology. In R. Jones (Ed.), Attitudes and attitude
change in special education: Theory and practice. Reston, VA: The
Council for Exceptional Children.

Triandis, Harry C. (1971). Attitude and attitude change. New York: John
Wiley & Sons.

Triandis, Harry C., Adamopoulos, John, & Brinberg, David. (1984).
Perspectives and issues in the study of attitudes. In Reginald L. Jones
(Ed.), Attitudes and attitude change in special education: Theory and
practice. Reston, VA: The Council for Exceptional Children.

Tringo, John. (1970). The hierarchy of preference toward disability groups.
The Journal of Special Education, 4, 295-306.

Voeltz, Luanna M. (1980). Children's attitudes toward handicapped peers.
American Journal of Mental Deficiency, 84, 455-464.

Voeltz, L. (1982). Effects of structured interactions with severely
handicapped peers on children's attitudes. American Journal of Mental
Deficiency, 86, 380-390.

Wagner, Richard V., & Sherwood, John J. (Eds.). (1969). The study of
attitude change. Belmont, CA: Brooks/Cole.

Wallston, B., Blanton, R., Robinson, J., & Pollchink, L. (1972). Community
resources development in rehabilitation of the handicapped. Nashville, TN.
(ERIC Document Reproduction Service, Document No. ED 078 239)

Walberg, Herbert J. (1986). Syntheses of research on teaching. In Merlin
C. Wittrock (Ed.), Handbook of research on teaching (3rd ed.)
(pp. 214-229). New York: Macmillan Publishing Company.

Warren, S., Turner, D., & Brody, D. (1964). Can education students'
attitudes toward the retarded be changed? Mental Retardation, 2, 235-242.

Watts, William A. (1984). Attitude change: Theories and methods. In
Reginald L. Jones (Ed.), Attitudes and attitude change in special
education: Theory and practice. Reston, VA: Council for Exceptional
Children.

Westwood, M., Vargo, J., & Vargo, F. (1981). Methods for promoting attitude
change toward and among physically disabled persons. Journal of Applied
Rehabilitation Counseling, 12(4), 220-225.

White, K., Bush, D., & Casto, G. (1985-86). Learning from reviews of early
intervention. The Journal of Special Education, 19(4), 417-428.


White, Karl R., & Casto, Glendon. (1985). An integrative review of early
intervention efficacy studies with at-risk children: Implications for the
handicapped. Analysis and Intervention in Developmental Disabilities, 5,
7-31.

White, Karl R., Myette, Beverly, Baer, Richard, & Taylor, Cie. (1982). A
meta-analysis of previous research on the treatment of hyperactivity (Final
Report for Grant # HEW/OE/NIE-G-80-0008). Logan, UT: Exceptional Child
Center, Utah State University. (ERIC Document Reproduction Service No. ED
224 218)

Willson, Victor L., & Putnam, Richard R. (1982). A meta-analysis of pretest
sensitization effects in experimental design. American Educational
Research Journal, 19(2), 249-258.

Wilson, E., & Alcorn, D. (1969). Disability simulation and development of
attitudes toward the exceptional. The Journal of Special Education, 3,
303-307.

Wilson, G. Terence, & Rachman, S. J. (1983). Meta-analysis and the
evaluation of psychotherapy outcome: Limitations and liabilities. Journal
of Consulting and Clinical Psychology, 51(1), 54-64.

Wolf, Fredric M. (1986). Meta-analysis: Quantitative methods for research
synthesis. Sage University Paper series on Quantitative Applications in
the Social Sciences, 07-001. Beverly Hills: Sage Publications.

Wolins, L. (1962). Responsibility for raw data. American Psychologist, 17,
657-658.

Wyrick, J. (1968). The effect of the lecture method on attitudes toward
physical disability. Unpublished master's thesis, University of Kansas.

Yerxa, E. (1971). The effects of a dyadic, self-administered instructional
program in changing the attitudes of female college students toward
physically disabled persons. Dissertation Abstracts International, 32,
1931-1932A.

Yuker, Harold E. (1983). The lack of a stable order of preference for
disabilities: Response to Richardson and Ronald. Rehabilitation
Psychology, 28(2), 93-103.

Yuker, Harold E. (1986, June). The effects of contact on attitudes toward
disabled persons: Some empirical generalizations. Abstracts of papers
presented at the Attitudes Toward Persons With Disabilities Conference,
Hofstra University, Hempstead, New York.

Yuker, Harold E., & Block, J. R. (1986). Research with the attitudes
towards disabled persons scales (ATDP): 1960-1985. Hempstead, NY: Center
for the Study of Attitudes Toward Persons With Disabilities, Hofstra
University.


Yuker, Harold E., Block, J. R., & Younng, Janet. (1970). The measurement
of attitudes toward disabled persons. Albertson, NY: Human Resources
Center. (ED 044 853)

Zimbardo,  Philip  G.,  Ebbesen,  Ebbe  B.,  &  Maslach,  Christina.  (1977). 
Influencing  attitudes  and  changing  behavior  (2nd  ed.).  Reading,  MA: 
Addison-Wesley. 

Zukerman,  R.  (1975).  Changes  in  knowledge  and  attitudes  as  a  result  of 
participation  in  a  teacher  education  game  on  the  labeling  of  handicapped 
children.     Dissertation  Abstracts  International,  36,  6031-6032A. 


APPENDIX  A 

STATEMENT OF GENERAL PURPOSE AND POPULATIONS



META-ANALYSIS:    MODIFYING  ATTITUDES  TOWARD  DISABLED  PERSONS 
STATEMENT OF GENERAL PURPOSE AND POPULATIONS

The purpose of the meta-analysis project is to conduct a comprehensive,
systematic review of the literature to determine what is known about
modifying attitudes toward disabled persons. One objective is to identify
research questions on which further investigation is especially needed, as
well as those that do not merit further research. A secondary purpose is to
analyze the quality of the research, both to determine if effect sizes covary
with quality and because the quality of research is of interest in itself.

The population of studies is all English-language reports of research to
modify attitudes toward disabled persons. In initial screening, titles or
references to studies will be examined for words that indicate that intent
for the research. In screening actual reports for inclusion, only those
reports that contain a clear statement of intent to investigate (1) the
"modification of attitudes" (2) toward "persons with disabilities" (whether
specific disabilities or disability in general) will be included. Included
will be journal articles, master's theses, doctoral dissertations, and other
unpublished papers which are identifiable through conventional computer and
hand search techniques and the bibliographies of reports that are read. The
primary emphasis, therefore, will be on research conducted in North America,
particularly the USA; but research from other countries will not be excluded.

The research of interest is that carried out to investigate empirically
the effects of interventions, or treatments, on the attitudes of nondisabled
persons toward disabled persons. Correlational research will not be
included. For example, contact with disabled persons is one common type of
intervention. However, if a researcher went into a school district and
obtained information to categorize students according to the amount of their
contact with disabled persons during the prior year and then compared
attitude means for contact-hour groups or correlated amount of contact with
scores on an attitude toward disabled measure, that study would not qualify.
Experimental and quasi-experimental studies are of particular interest.
Pre-experimental single-group studies that involved a planned intervention
and pretest-posttest data will also be included, as will static group designs
used to investigate program effects. Mainstreaming studies will be included
if the effects on attitudes toward disabled persons have been investigated,
although much of this research has involved the use of pre-experimental
designs.

Research reports of the effects of interventions on the attitudes of any
age or occupational group are of interest as long as the research was
directed toward changing attitudes toward disabled or handicapped persons.

"Disabled or handicapped persons" is defined in terms of conventional
special education categories: mentally retarded, hard of hearing, deaf,
speech impaired, visually handicapped, seriously emotionally disturbed (or
mentally ill), orthopedically impaired, deaf-blind, multi-handicapped, and
learning disabilities, as well as general categories such as "the disabled"
or "the handicapped". Studies of subjects from populations such as
"disruptive" students or "slow learners" are not to be included.

Attitudes toward disabled persons is the dependent variable of interest.
Attitudes are considered to have cognitive, affective, and behavioral
components and may be assessed in a variety of ways, including
paper-and-pencil tests which have items that are cognitive-affective
mixtures, assessments of changes in voluntary interactions with disabled
persons, or reactions on projective-type tests. Measures which assess only
knowledge about the disabled do not qualify for this meta-analysis unless the
research report authors consider them to be attitude assessments, nor do
measures which assess attitudes toward mainstreaming. General measures of
attitudes toward children or other people will not be included unless
specifically aimed at disabled persons or a particular type of disability,
through instructions to the Ss or because of the content of the study (e.g.,
an attempt to change parents' attitudes toward their disabled children).

Measures such as sociometric scales, friendship choices, or positive
interactions are relevant only if the researcher(s) consider them to be
assessments of attitudes. Observational or other data gathered and/or
reported so that the behaviors or responses of nondisabled Ss toward disabled
persons or the direction of behavioral or response change cannot be
identified will not be included, even if considered in the report to be
attitude assessments.

The intent is to obtain effect sizes to be analyzed. However, studies
which are experimental or quasi-experimental in nature for which effect sizes
cannot be obtained will be coded for other information if time is available.
Those studies will be analyzed separately.
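
The inclusion rules stated above can be summarized as a small screening
routine. The following is a minimal Python sketch, not the project's actual
procedure: the Report fields, the design labels, and the screen function are
hypothetical names introduced for illustration, and the handling of eligible
pre-experimental studies without obtainable effect sizes is an assumption,
since the statement does not address that case.

```python
from dataclasses import dataclass

# Design labels are illustrative; the statement names these four kinds of
# design as eligible and excludes correlational research.
ELIGIBLE_DESIGNS = {
    "experimental",
    "quasi-experimental",
    "single-group pretest-posttest",
    "static-group",
}


@dataclass
class Report:
    english_language: bool
    states_intent_to_modify_attitudes: bool
    targets_disabled_persons: bool
    design: str                   # e.g. "experimental", "correlational"
    has_attitude_measure: bool    # measure counts as an attitude assessment
    effect_size_obtainable: bool


def screen(report: Report) -> str:
    """Return a screening decision for one report (hypothetical labels)."""
    meets_criteria = (
        report.english_language
        and report.states_intent_to_modify_attitudes
        and report.targets_disabled_persons
        and report.design in ELIGIBLE_DESIGNS
        and report.has_attitude_measure
    )
    if not meets_criteria:
        return "exclude"
    if report.effect_size_obtainable:
        return "main analysis"
    # The statement sets aside experimental and quasi-experimental studies
    # without effect sizes for separate coding, if time is available.
    if report.design in {"experimental", "quasi-experimental"}:
        return "separate analysis"
    # Assumption: the statement is silent on other designs without effect sizes.
    return "exclude"
```

Under these assumptions, a school-district survey that only correlates amount
of prior contact with attitude scores would be screened out by its design
label, matching the example given in the statement.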
APPENDIX B
CODING  INSTRUMENTS 


(1)  Coding  Instrument 

(2)  Prior  Contact  Coding  Sheet 

(3)  Contact  Coding  Sheet 

(4)  ES  Information  Missing  Coding  Instrument 

(5)  Supplementary  Sheets 

(a)  Effect  Sizes 

(b)  Information  Request 

(c)  Comments  on  Study 

(d)  Effect  Size  Computations 

(e)  Comments  on  Conventions 

(f)  Report  Disposition 
Report ID# ______
Time start ______
Time end ______

Instrument ___ of ___

Coder ______

Date ______

META-ANALYSIS:    MODIFYING  ATTITUDES 
TOWARD DISABLED PERSONS

CODING  INSTRUMENT 


(Author(s)/Year) 


(Abbreviated  Title  &  Source) 


Checklist 


1.  Citation  checked 

2.  References  checked 

3.  Every  space  marked 

4.  ES  data  available 

5.  ES's  computed 


6.  ES's  checked 

7.  Comments  on  Conventions  Sheet 

8.  Comments  on  Study  Sheet 

9.  Scoring  log  completed 

10.  Report  disposition  sheet  completed 


A.  General Information

                                              ES__  ES__  ES__  ES__

1-2    *1. Project code                       M A   M A   M A   M A

3-6    *2. Report ID #

7-8    *3. Card #                             0 1   0 1   0 1   0 1

9-10   *4. Year of publication

11     *5. Type of report (1=journal; 2=book chapter; 3=book;
           4=dissertation; 5=thesis; 6=convention paper;
           7=unpublished report; 8=other, explain ______;
           9=combination, specify ______)

ES# 

ES# 

ES# 

ES# 


Description of ES Comparison(s)

ES 


ES 


ES 


ES 


a.  N  of  ES's 


b.  ID  of  ES 


c.  Level (1=primary; 2=secondary)


d.  Type of comparison (1=treatment vs.
control; 2=treatment vs. placebo;
3=treatment A vs. B; 4=pre-post;
Interaction, treatment by:
5=gender, 6=age/grade, 7=testing,
8=personality, specify ______,
9=other, specify ______;
Treatment within: 10=gender,
female; 11=gender, male; 12=other,
specify ______; Prior
contact: 13=within treatment; 14=across
groups; 15=interaction)


*7.  Target Population (0=not mentioned; 1=term
used; 2=defined; 3=population described;
4=1&2; 5=1&3)


*8.  Accessible Population (0=not mentioned;
1=term used; 2=defined; 3=described; 4=1&2;
5=1&3; 6=1, 2, & 3)


9.  Replication (0=no; 1=direct; 2=systematic;
3=pseudo, within)

a.  of  other  research 


b.  within  study 


B.    Description  of  Sample(s) 

1.  N:

a.  of  total  sample 


b.  of  experimental  Ss 


c.  of  control  Ss 
ES


ES 


ES 


ES 


36 
37 


2.  Sample Selection (0=can't tell; 1=random;
2=solicited/volunteer; 3=captive/intact group;
4=random, group; 5=other, specify: ______)

Experimental
Control


3.  %  Male 


38-40 
41-43 
44-46


Total  Sample 
Experimental 
Control 


47 


4.  Treatment Context (0=can't tell; 1=elementary or
secondary schooling; 2=college/university
education; 3=adult education; 4=inservice;
5=work; 6=community; 7=recreation; 8=other, specify
______)


48-49 
50-51


5.  Educational Level of Ss (0=can't tell; 1=preschool;
2=primary; 3=intermediate; 4=middle school;
5=junior high school; 6=senior high school;
7=combination, specify ______;
8=undergraduates; 9=graduate students;
10=post-professional; 11=adults not in school;
12=other, specify ______)


Experimental 
Control 


52-53 
54-55 


6.  University Students According to Major (0=not
applicable; 1=can't tell; 2=elementary education;
3=secondary education; 4=elementary & secondary
education; 5=education, unspecified; 6=nursing;
7=occupational & physical therapy; 8=philosophy;
9=psychology; 10=rehabilitation counseling;
11=social work; 12=sociology; 13=special education;
14=communicative disorders; 15=medicine; 16=other,
specify ______)

Experimental 
Control 
ES


ES 


ES 


56-57 
58-59 


ES 


7.  Occupation of Nonstudent Ss (0=not applicable;
1=can't tell; 2=child care workers; 3=community
recreation workers; 4=employees in institutions;
5=regular class teachers, elementary; 6=middle
or junior high; 7=high school; 8=school
administrators; 9=special class teachers; 10=vocational
rehabilitation counselors; 11=parents; 12=police;
13=medical; 14=general public; 15=other, specify
______)


Experimental 
Control 


8.  Prior Experiences with Disabled Persons (0=can't
tell; 1=none; 2=as parents; 3=as siblings;
4=classmates; 5=as teachers; 6=in school;
7=as co-workers; 8=as supervisors; 9=in work
setting; 10=as clients/patients; 11=general,
not specific; 12=combination, specify ______;
13=other, specify ______)


60-61 
62-63 


64 


Experimental 
Control 


9.  Country of subjects (1=USA; 2=Canada; 3=Australia/
New Zealand; 4=Europe, specify ______;
5=other, specify ______)


65 


C.  Treatment/Intervention 

1.  Basis (0=can't tell; 1=theory, explicit; 2=prior
research, no citations; 3=prior research, few
citations; 4=prior research, case developed;
5=practical experience/insight; 6=other,
specify ______)


66 


2.  Attitude Change Theory

a.  Theory (1=S-R/behavioral; 2=conditioning;
3=congruity/equilibrium; 4=social judgment;
5=functional; 6=combination, specify
______)


67 


b.  Relationship to treatment (1=mentioned but not
used; 2=brief allusion; 3=explicit, well
developed basis; 4=post hoc interpretation;
5=implicit). If mainstreaming study with no
prior intervention to change attitudes, skip to
C.10., and X-out Sections C.3-9.
ES


68-69 
70-71 


ES 


ES 


ES 


3.  Setting (0=can't tell; 1=regular classroom;
2=special classroom; 3=home; 4=institution; 5=group
home; 6=hospital; 7=dormitory; 8=playground;
9=camp; 10=recreation facility; 11=laboratory;
12=individual/small group; 13=normal life;
14=other, specify ______;
15=combination, specify ______)

Experimental
Control


72-73 
74-75


76-77 
78-79 


4.  Treatment/Intervention Technique(s) (0=none/
control; 1=placebo; 2=information; 3=direct
contact; 4=vicarious experience; 5=positive
reinforcement; 6=persuasive message; 7=persuasive
messages, contrast; 8=2&3; 9=2&4;
10=other, specify ______; 11=systematic
desensitization)

Experimental 
Control 


a.  Information 

(1)  Type (0=none; 1=etiology; 2=characteristics;
3=problems; 4=similarities with nondisabled;
5=prostheses and special equipment; 6=famous
disabled people; 7=legal rights; 8=parenting/
management; 9=self; 10=social relations;
11=other, specify ______;
12=combination, specify ______)

Experimental 
Control 


(End of Card #1)


1-2    Project code         M A   M A   M A   M A

3-6    Report ID#

7-8    Card #               0 2   0 2   0 2   0 2
ES

ES  

ES  

ES  

(2)  Delivery mode (0=none; 1=lecture;
2=discussion; 3=lecture/discussion;
4=print; 5=panel discussion-disabled;
6=panel discussion-nondisabled;
7=speaker-disabled; 8=speaker-nondisabled;
9=video or films; 10=pictures/photos/filmstrips;
11=case study; 12=audio;
13=simulations; 14=regular course, specify
______; 15=regular program, specify
______;
17=combination, specify ______)

9-10 
11-12 

Experimental 
Control 

b.  Direct contact (0=none; 1=as companions; 2=as
peer tutors; 3=in cooperative learning groups;
4=as classmates; 5=as classmates, behavior
modified; 6=as students, behavior modified;
7=practice teaching-special classes;
8=classroom observation; 9=supervised playground
activities; 10=in recreation programs; 11=panel
discussions; 12=guest speaker(s); 13=visit to
institution/residential facilities; 14=integrated
community programs; 15=as teacher/counselor;
16=[illegible]; 17=other, specify ______;
18=combination, specify ______)

13-14 
15-16 

Experimental
Control 

c.  Vicarious experience (0=none; 1=role play-contact;
2=role play-disabled; 3=simulation;
4=observation of role play or simulation;
5=videotapes or films; 6=case studies; 7=pictures/
photos; 8=print fiction/biography; 9=dolls/[illegible];
11=combination, specify ______)

17-18 
19-20 

Experimental 

Control

d.  Positive reinforcement (0=none; 1=covert;
2=overt)

21 
22 

Experimental 

Control

e.  Persuasive message (0=none; 1=video/film;
2=audio; 3=print; 4=expert; 5=expert, disabled;
6=self-presentation; 7=other, specify
______; 8=combination,
specify ______)

23 
24 

Experimental 
Control 
25-26


ES


27 
28 


29 


30 
31 


32-34
35-37 


38-40 
41-43 


44-48 
49-53 



ES 


ES 


ES 


5.  Treatment/Intervention to Change Attitudes
Toward: (1=disabled in general; 2=physically
disabled in general; 3=retarded, general;
4=moderately retarded; 5=severely retarded;
6=mentally ill; 7=emotionally disturbed;
8=visually impaired/blind; 9=hearing
impaired/deaf; 10=learning disabled;
11=speech/language impaired; 12=cerebral
palsied; 13=epileptic; 14=para/quadriplegic;
15=other physically impaired, specify
______; 16=autistic; 17=health
impaired; 18=multiply handicapped; 19=other,
specify ______; 20=combination,
specify ______)


6.  Treatment/Intervention Conducted By: (0=can't
tell; 1=experimenter; 2=project
assistants; 3=regular staff;
4=combination, specify ______; 5=other,
specify ______; 6=not applicable)

Experimental 
Control 


7.  Length of Treatment/Intervention

a.  Information available (0=no; 1=yes)


b.  Days/week 

Experimental 
Control 


c.  Minutes/day 

Experimental 
Control 


d.  Number of weeks (If less than 1,
insert x's)

Experimental 
Control 
e.  Total # of hours

Experimental 
Control 
54

55 


56 
57 


58 

59 


ES_ 

ES  

ES  

ES  

8. Verification of Treatment Implementation (0=none;
1=not necessary; 2=systematic observation;
3=nonstructured observation; 4=interviews
with Ss; 5=questionnaires to Ss; 6=intervenor
follow-up; 7=other, specify ____;
8=combination, specify ____)

Experimental 
Control 


a. Reporting (0=not applicable; 1=data; 2=data
and analysis presented; 3=assertion by
author(s))

Experimental 
Control 

— 

— 

— 

b. Degree of implementation claimed (0=not
applicable; 1=none; 2=some; 3=mostly;
4=complete)

Experimental 
Control 

c. Basis for author's conclusion re implementation
(0=not applicable; 1=can't tell;
2=author's judgment or inference; 3=statistical
significance; 4=inspection of data; 5=combination,
specify ____; 6=other,
specify ____)

Experimental
Control

d. Actual implementation (1=complete; 2=mostly;
3=only in part)

e.  Description  of  treatment  adequate  for 
replication  (0=no;  l=somewhat;  2=yes) 

60 
61 


62 


63 
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ES 


ES


ES 


ES 


9. Treatment validity

a. Implementation (from 8d)

b. Hawthorne

c. John Henry

d. Treatment diffusion

e. Dissatisfaction/resentment

f. Novelty/disruption

g. Experimenter effect/expectations

h. Treatment/experimenter confounded

i. Test X treatment interaction

j. Multiple treatment interference

(0=can't tell; 1=not a plausible threat; 2=minor problem;
3=substantial problem; 4=major problem; 5=not applicable.
For 3, 4, & 5, write reason next to item.)

k. General Treatment Validity (1=excellent;
2=fair; 3=poor)


10.  Mainstreaming 

a. Presence (0=no mainstreaming; 1=mainstreaming
only; 2=pretreatment plus
mainstreaming)


b. Type of study (0=can't tell; 1=planned;
2=post hoc)


c. Instruction in mainstreamed classes (0=can't
tell; 1=standard group/class instruction;
2=cooperative learning; 3=individualized
instruction; 4=peer tutoring; 5=combination,
specify ____; 6=other,
specify ____)


d. Special personnel support (0=can't tell; 1=none;
2=teacher preparation; 3=consultant help; 4=other,
specify ____)


e.  Special  skills  training  for  disabled  students 
(0=can't  tell;  l=none;  2=social;  3=academic; 
4=both) 
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(End  of  Card  #2) 


ES    ES    ES    ES

Project Code    M A    M A    M A    M A

Report ID#

Card #    0 3    0 3    0 3    0 3


f. Type of special skills training (0=not applicable;
1=coaching; 2=modeling; 3=counseling;
4=direct reinforcement; 5=group contingencies;
6=diagnostic/prescriptive; 7=cognitive control;
8=combination, specify ____;
9=other, specify ____)


g. Special instruction for nondisabled peers
(0=can't tell; 1=none; 2=information; 3=vicarious
experience; 4=reinforcement; 5=persuasive
messages)


h. Parent education (0=can't tell; 1=none; 2=disabled
children; 3=nondisabled children; 4=2&3)


i. Type of parent education (0=can't tell; 1=not
applicable; 2=information; 3=vicarious experience;
4=reinforcement; 5=persuasive messages)


j. Disabled children in mainstreamed classes
(0=can't tell; 1=mildly and moderately retarded;
2=emotionally disturbed; 3=visually impaired;
4=blind; 5=hearing impaired; 6=deaf; 7=communication
disordered; 8=physically and health
impaired; 9=learning disabled; 10=combination,
specify ____; 11=others, specify ____)


k. Number of minutes handicapped children spend
daily in mainstreamed class


l. Number of days per week in mainstreamed class


m. Total minutes per week in mainstreamed class


n. Months in mainstreamed program when outcomes
assessed


o. Outcome measured for (1=nondisabled peers; 2=
teachers; 3=parents, disabled children; 4=
parents, nondisabled children; 5=administrators;
6=combination, specify ____)
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ES 


ES 


ES 


ES 


27 


28-29


D. Dependent Measures

*1. "Attitude" defined (0=no; 1=affective; 2=cognitive;
3=behaviors; 4=1&2; 5=1&3; 6=1, 2, & 3)


*2. Number of dependent measures. List (by name of
instrument and form, if possible, or general
category):

1.    4.   

2.    5.  

3.  6. 


30 


3. Use of common instrument (0=no; 1=ATDP; 2=OMI;
3=RGEPS; 4=ATHI; 5=ATBS; 6=MRAI; 7=MTAI,
revised; 8=combination, specify ____)


31 


4. Common instrument modified (0=not applicable; 1=no;
2=for different population of Ss; 3=for different
disabilities; 4=other, specify ____)


32


5. Source of data (1=self-report; 2=opinion of
other, teacher; 3=opinion of other, administrator;
4=opinion of other, specify ____;
5=observation; 6=nonproject request)


33-34


6. Type of assessment (1=interview, structured;
2=interview, nonstructured; 3=attitude questionnaire;
4=sociometric measure; 5=peer assessment;
6=social distance scale; 7=informal observation;
8=systematic observation; 9=semantic differential;
10=telephone or mail survey; 11=telephone
or mail request; 12=Q-sort; 13=projective test,
pictures; 14=sentence completion; 15=adjective
checklist; 16=rankings; 17=other, specify
____)


35


7. Source of instrument (0=not applicable; 1=can't
tell; 2=teacher-made; 3=project developed; 4=prior
research; 5=other, specify ____)
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ES 


ES 


ES 


ES 


8. Development by project (0=not applicable;
1=description not provided; 2=description not adequate;
3=adequate description)


9.  Reliability  of  scores 

a. Mentioned? (0=no; 1=coefficient reported; 2=
yes, but no coefficient)


b. Source (0=not applicable; 1=can't tell; 2=computed
on sample; 3=reported from other research;
4=pilot study; 5=combination, specify ____)


c. Method (0=not applicable; 1=can't tell;
2=test-retest; 3=internal consistency;
4=alternate forms; 5=inter-observer, %;
6=inter-observer, r; 7=intra-observer, %;
8=intra-observer, r; 9=categorization
reliability; 10=combination, specify
____)


d. Coefficient


e. Magnitude (0=not applicable; 1=.80-1.0;
2=.60-.79; 3=.0-.59)
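
The magnitude bands in item 9.e can be applied mechanically once a coefficient is recorded in 9.d. The following is only a sketch: the function name is hypothetical, and it assumes that the bands are applied to the unrounded coefficient and that a missing coefficient is coded 0 (not applicable). The instrument itself does not state how borderline values such as .795 were handled.

```python
def reliability_magnitude_code(coefficient=None):
    """Map a reliability coefficient to the item 9.e magnitude code.

    Codes follow the form: 0=not applicable; 1=.80-1.0; 2=.60-.79; 3=.0-.59.
    Assumption: no coefficient reported is coded 0.
    """
    if coefficient is None:
        return 0
    if not 0.0 <= coefficient <= 1.0:
        raise ValueError("coefficient must lie between 0 and 1")
    if coefficient >= 0.80:
        return 1
    if coefficient >= 0.60:
        return 2
    return 3
```

For example, a reported test-retest coefficient of .72 would be coded 2 under these assumptions.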


10.  Validity  of  scores 

a. Discussed (0=no; 1=moderately; 2=comprehensively)


b. Type (0=not applicable; 1=general; 2=face;
3=construct-discrimination or correlations;
4=construct-expert judgment; 5=concurrent;
6=combination, specify ____)


c. Source (0=not applicable; 1=not mentioned;
2=general reference to literature; 3=citation
of research; 4=project data; 5=inference without
data; 6=combination, specify ____)


d. Reactivity of measure (1=low; 2=moderate;
3=high)


e.  Adequacy  of  validity  (l=low;  2=moderate; 
3=high) 
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E!S  

ES_ 

ES  

11.  Data  collection 

49 
• 

- 

a. Type (0=can't tell; 1=regular staff;
2=researcher; 3=research assistant; 4=noncontact;
5=other, specify ____)

50 

b. Blinded collection (0=can't tell; 1=not
applicable; 2=no; 3=partial: experimental or
control, or pre or post only; 4=yes)

51

12. Blinded scoring (0=can't tell; 1=not applicable;
2=no; 3=partial: experimental or control, or
pre or posttest only; 4=yes)

52 

13. Time of posttest (0=can't tell; 1=immediate;
2=delayed; 3=follow-up)

53-56 

14. Weeks after intervention to posttest

E.    Internal  Validity 

57 

1. Design (1=pre-post, control; 2=posttest-only;
3=Solomon 4-group; 4=nonequivalent control group;
5=single subject; 6=pre-post, one-group; 7=static
group; 8=other, specify ____)

58


2. Assignment to groups (0=can't tell; 1=random;
2=match-random; 3=select or match from different
group; 4=random assignment of intact groups;
5=convenience; 6=not applicable; 7=other,
specify ____)

3. Threats

59 

a. Treatment Validity (from C.9.k., p. 9)

60 

61 

62 

63 
• 

64 

65 

— 

— 

— 

b. Maturation

c. History

d. Testing

e. Instrumentation

f. Statistical regression

g. Selection

66

h. Experimental mortality

(0=can't tell; 1=not a plausible threat; 2=minor threat;
3=substantial threat; 4=major problem. For 3 & 4,
write reason next to item.)

67

i. General Internal Validity (1=high;
2=medium; 3=low)
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ES  

ES  

ES  

ES  

F. Results

68 

1. 

Statistical significance (0=not
available; 1=not significant at .05
level; 2=significant at the .05 level)

2. 

Author's  conclusions  about  effectiveness 

69 

— 

— 

a. Qualified? (0=no; 1=sample; 2=interactions;
3=design; 4=measures; 5=treatment
verification; 6=replication;
7=other, specify ____;
8=combination, specify ____)

70 

— 

— 

b. Treatment effective? (0=none; 1=
didn't have effect; 2=data equivocal;
3=produced effect; 4=produced negative
effect)

3. 

Effect  size(s) 

71 

a. Available (0=no; 1=yes, positive
change; 2=yes, negative change; 3=
no, negative; 4=no, positive)

72- 

77 

• 

• 

• 

b. D

1)  +  D 

(End  of  Card  #3) 
*****************************************

1-2    Project code    M A    M A    M A    M A

3-6    Report ID#

7-8    Card #    0 4    0 4    0 4    0 4

9-10

2) Source (0=not applicable; 1=reported;
2=calculated; 3=t or ANOVA
F; 4=correlated t; 5=correlated t
(.50); 6=n-way ANOVA; 7=COVAR;
8=COVAR (.50); 9=proportions, chi-square;
10=other nonparametric;
11=rpb; 12=significance level;
13=other, specify ____)
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BS 


11 


ES 


ES 


ES 


3) Scale of X difference (0=not applicable;
1=raw post; 2=raw gain; 3=
covariance adjusted; 4=residual gain)


12-13


14-19

20-25


26


27-29


4) Standard deviation (0=not applicable;
1=post control/placebo;
2=pretest; 3=1&2 pooled; 4=1-way
ANOVA SSs; 5=n-way ANOVA SSs;
6=1-way ANCOVA SSs; 7=n-way ANCOVA
SSs; 8=adjusted COVAR sd; 9=adjusted
gain sd; 10=pooled post)


5)  Mdn  primary  D  for  type  of  assessment 


6)  Mdn  primary  D,  overall 
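
The instrument records D together with the standard deviation used to scale it (item 4), but it does not print the formula. As a rough illustration only, the sketch below uses the conventional standardized mean difference, dividing the experimental-minus-control difference by whichever standard deviation the coder selected. The function names and the pooled-SD helper are assumptions made for this example and are not taken from the report.

```python
import math


def pooled_sd(sd_a, n_a, sd_b, n_b):
    """Pool two standard deviations, weighting each by its degrees of freedom."""
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    pooled_variance = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return math.sqrt(pooled_variance)


def effect_size_d(mean_experimental, mean_control, sd):
    """Standardized mean difference: (experimental - control) / sd."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (mean_experimental - mean_control) / sd


# Hypothetical posttest means of 55 and 50 with a control-group SD of 10.
example_d = effect_size_d(55.0, 50.0, 10.0)
```

Which standard deviation enters the denominator (control posttest, pretest, pooled, and so on) is exactly what the codes in item 4 are meant to document, so the choice is left to the caller here.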


c.  Correlation 


1) Type (0=none; 1=rpb; 2=E^; 3=Phi;
4=Cramer's V; 5=other, specify ____)


2)  +  coefficient 


30 


3)  Source  (O=not  applicable;  l=reported; 
2=calculated;  3=estimated  from  D) 


31-33


4) Mdn primary coefficient for type of
assessment


42-46


5) Mdn primary coefficient, overall
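
Source code 3 for the correlation ("estimated from D") implies a conversion from D to a correlation coefficient, but the form does not say which conversion was used. One common textbook conversion, valid when the two groups are of equal size, is r = D / sqrt(D^2 + 4). The sketch below assumes that formula; it should not be read as the authors' documented procedure.

```python
import math


def r_from_d(d):
    """Convert a standardized mean difference to a point-biserial r.

    Assumes equal group sizes: r = d / sqrt(d**2 + 4).
    """
    return d / math.sqrt(d * d + 4.0)


# The sign of D carries over to the estimated coefficient.
r_positive = r_from_d(0.5)
r_negative = r_from_d(-0.5)
```

With unequal group sizes the constant 4 would be replaced by a term depending on the group proportions, which is one reason to record the source of each coefficient as the form requires.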


d.  Variance  ratio 
1)  Ratio  (s|/S^) 


2) Mdn primary variance ratio, type
of assessment


3) Mdn primary variance ratio, overall
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52 


53 


54 


55-60

61 


ES  

ES  

ES  

ES  

G.  Supplemental  Information 

1.  Information  gain 

a. Reported? (0=not applicable; 1=yes;
2=no)

b. Type of report (0=not applicable; 1=
verbal; 2=statistical significance;
3=descriptive statistics; 4=2&3)

c. Conclusions (0=not applicable;
1=clear gain; 2=no gain; 3=mixed
results; 4=can't tell/inconclusive)

d. Effect size: D

2. Type of study (0=can't tell; 1=course
evaluation; 2=program evaluation;
3=experimental treatment; 4=mainstreaming,
variation in time)

H. Coding Summary
1.  Minutes  spent  coding 

2.  Coder  (l=Curtis;  2=Jesunathadas; 
3=Shaver;  4=Strong) 

62-64

65
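
The column numbers printed beside each item indicate that every card begins with the same identification fields: project code in columns 1-2, report ID in columns 3-6, and card number in columns 7-8. A minimal sketch of reading those fields from a fixed-width record follows. The function name and the sample record are hypothetical, and the assumption that each card is stored as one 80-character line is not stated in the instrument.

```python
def parse_card_header(record):
    """Split the identification columns from one fixed-width card record."""
    if len(record) < 8:
        raise ValueError("record must contain at least columns 1-8")
    return {
        "project_code": record[0:2],       # columns 1-2, e.g. "MA"
        "report_id": record[2:6],          # columns 3-6
        "card_number": int(record[6:8]),   # columns 7-8, e.g. "04"
    }


# Hypothetical card 4 for report 0123, padded to 80 columns.
example_header = parse_card_header("MA012304" + " " * 72)
```

Checking the card number against the expected sheet before reading the remaining columns is a cheap way to catch cards entered out of order.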
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Report  ID# 


Author(s)/Date 


META-ANALYSIS:    MODIFYING  ATTITUDES 
TOWARD  DISABLED  PERSONS 

I. Prior Contact Coding Sheet


ES    ES    ES    ES

*1. Project code    M A    M A    M A    M A

2. Report ID

3. Card #    0 5    0 5    0 5    0 5


4. Coder (1=Curtis; 2=Jesunathadas; 3=Shaver;
4=Strong)


5. Prior contact assessed (0=no; 1=yes; 2=implicit)


6. Use: selection/assignment (0=not applicable;
1=none; 2=describe sample only; 3=eligibility as
Ss, none; 4=eligibility as Ss, yes; 5=selection
strata; 6=assignment strata; 7=5&6; 8=covariate;
9=other, ____; 10=combination, ____)


7. Use: outcome analysis, correlational (0=not
applicable; 1=none; 2=within treatment, post;
3=within treatment, change; 4=treatment vs.
control, post; 5=treatment vs. control, change)


8.  ID  of  ES 


9.  N  of  added  secondary  ESs 


10. Type of assessment (0=not applicable; 1=can't
tell; 2=questionnaire; 3=interview; 4=setting;
5=other, specify ____; 6=2&3; 7=2&4; 8=3&4;
9=2, 3, & 4; 10=combination ____)


11. Type of prior contact setting (0=not applicable;
1=mainstreamed classroom; 2=mainstreamed, school;
3=mainstreamed, regular teacher; 4=special
education; 5=institution; 6=family; 7=other
____; 8=combination, ____)
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ES 


ES 


ES 


ES 


21-22 


12. Definition of degree of contact (0=not applicable;
1=can't tell; 2=amount; 3=number; 4=frequency;
5=length; 6=intensity; 7=dichotomy; 8=type of
relationship; 9=institutional visit; 10=other,
specify ____; 11=combination, ____)


23 


13. Quality of contact (0=not applicable; 1=can't tell;
2=no; 3=yes)


24 


14. Direction of quality comparison (0=not applicable;
1=positive vs. neutral; 2=negative vs. neutral;
3=positive vs. negative)


25-26 


15. Type of disability (0=not applicable; 1=disabled
in general; 2=physically disabled in general;
3=retarded, general; 4=moderately retarded;
5=severely retarded; 6=mentally ill; 7=emotionally
disturbed; 8=visually impaired/blind; 9=hearing
impaired/deaf; 10=learning disabled; 11=speech/
language impaired; 12=cerebral palsied;
13=epileptic; 14=para/quadriplegic; 15=other
physically impaired, specify ____;
16=autistic; 17=health impaired; 18=multiply
handicapped; 19=other, specify ____;
20=combination, specify ____)


27-29 


16.  Minutes  spent  coding 
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Coder  Date 


Time:  Start 
End  _ 

Sheet 


1-2


3-6 


7-8 


9


10 


11 


12 


13 


14-15 


16 


17 


18 


of 


META-ANALYSIS: MODIFYING ATTITUDES
TOWARD DISABLED PERSONS

J.    Contact  Coding  Sheet 


Report  ID# 


Author(s)/Date 


ES    ES    ES    ES

1. Project code    M A    M A    M A    M A

2. Report ID

3. Card #    0 6    0 6    0 6    0 6


4. Coder (2=Jesunathadas; 3=Shaver)


5. Status

a. Age (0=can't tell; 1=disabled younger; 2=same;
3=disabled older; 4=variety)


b. Educational-vocational prestige (0=can't tell;
1=disabled lower; 2=same; 3=disabled higher)


c. Helping relationship (0=none; 1=professional;
2=preprofessional; 3=nonprofessional; 4=mutual
help; 5=disabled the helper)


d. Overall (0=can't tell; 1=disabled lower;
2=equal; 3=disabled higher)


6.  ID  of  ES 


7.  Type  of  contact 

a. Voluntariness (0=can't tell; 1=assigned;
2=role choice; 3=voluntary; 4=varied)


b. Intimacy (0=can't tell; 1=no interaction;
2=casual personal contact; 3=close personal
contact; 4=varied contact; 5=potential contact)


c. Cooperation-competition (0=can't tell; 1=no
opportunity; 2=not necessary; 3=implicit
cooperation; 4=explicit cooperation; 5=implicit
competition; 6=explicit competition; 7=
combination of cooperation and competition)
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ES 

ES 

ES 

ES 

19 

d. Reinforcement (0=can't tell; 1=none; 2=shared,
intrinsic; 3=shared, external; 4=nondisabled;
5=2 & 4; 6=3 & 4)

20 

e. Pleasantness (0=can't tell; 1=no; 2=yes)

21 

f. Modeling (0=can't tell; 1=none; 2=by peers;
3=by significant others)

22 

8. Institutional/authority/peer support (0=can't tell;
1=no; 2=yes)

9. Characteristics of disabled persons

23-24 

a. Disability (0=can't tell; 1=combination (specify
____; see p. J-7 of Conventions)

25 

b. Negative stereotype (0=can't tell; 1=yes; 2=no)

26 

c. Competence (0=can't tell; 1=lacked;
2=acknowledge/acceptance; 3=competent)

10. Characteristics of nondisabled

27 

a. Personality related (0=not assessed; 1=not
tested; 2=no; 3=mixed results; 4=yes)

28 

b. Prior attitudes related (0=not assessed;
1=not tested; 2=no; 3=mixed results; 4=yes)

29 

11. Treatment (1=contact; 2=contact & information;
3=contact & vicarious experience)

30-32 

12. Minutes spent coding


Report  ID# 
Time  start 
Time  end  _ 
Instrument 


of 


Coder 


Date 


(Author(s)/Year) 


META-ANALYSIS: MODIFYING ATTITUDES
TOWARD DISABLED PERSONS

CODING INSTRUMENT
ES Information Missing

(Abbreviated  Title  &  Source) 


Checklist 


1.  Citation  checked 

2.  References  checked 

3.  Every  space  marked 
4. ES data available
5. ES's computed


6. ES's checked

7. Comments on Conventions Sheet

8.  Comments  on  Study  Sheet 

9.  Scoring  log  completed 

10.  Report  disposition  sheet  completed 


A.    General  Information 


ES    ES    ES    ES

*1. Project code    M A    M A    M A    M A

*2. Report ID #

*3. Card #    0 1    0 1    0 1    0 1

*4.  Year  of  publication 

*5. Type of report (1=journal; 2=book chapter; 3=book;
4=dissertation; 5=thesis; 6=convention paper;
7=unpublished report; 8=other: Explain
____)

6.  Effect  Size(s) 
Description  of  ES  Comparison(s) 


ES# 
ES# 
ES# 
ES# 




12-13 


a. N of ES's


b. ID of ES


c. Level (1=primary; 2=secondary)


d. Type of comparison (1=treatment vs.
control; 2=treatment vs. placebo;
3=treatment A vs. B; 4=pre-post;
Interaction, treatment by:
5=gender; 6=age/grade; 7=testing;
8=personality, specify ____;
9=other, specify ____;
Treatment within: 10=gender,
female; 11=gender, male; 12=other,
specify ____)


*7. Target Population (0=not mentioned; 1=term
used; 2=defined; 3=population described;
4=1&2; 5=1&3)


*8. Accessible Population (0=not mentioned;
1=term used; 2=defined; 3=described; 4=1&2;
5=1&3; 6=1, 2, & 3)


9. Replication (0=no; 1=direct; 2=systematic)
a. of other research


b. within study


23-27


28-31 


B.    Description  of  Sample(s) 

1. N:

a.  of  total  sample 


b.  of  experimental  Ss 


c.  of  control  Ss 


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344 
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ES_ 

ES  

ES_ 

ES  

• 

2. Sample Selection (0=can't tell; 1=random;
2=solicited/volunteer; 3=captive/intact group;
4=random, group; 5=other, specify: ____)

36 
37 

Experimental
Control

3. % Male

38-40 
41-43 
44-46 

Total Sample
Experimental
Control

47

4. Treatment Context (0=can't tell; 1=elementary or
secondary schooling; 2=college/university
education; 3=adult education; 4=inservice; 5=
work; 6=community; 7=recreation; 8=other, specify
____)

5. Educational Level of Ss (0=can't tell; 1=preschool;
2=primary; 3=intermediate; 4=middle school; 5=
junior high school; 6=senior high school; 7=combination,
specify ____;
8=undergraduates; 9=graduate students; 10=post-professional;
11=adults not in school; 12=other,
____)

48-49
50-51 

Experimental 
Control 

• 

6. University Students According to Major (0=not
applicable; 1=can't tell; 2=elementary education;
3=secondary education; 4=elementary & secondary
education; 5=education, unspecified; 6=nursing;
7=occupational & physical therapy; 8=philosophy;
9=psychology; 10=rehabilitation counseling;
11=social work; 12=sociology; 13=special education;
14=communicative disorders; 15=medicine; 16=other,
specify ____)

52-53 
54-55 

Experimental 
Control 



56-57 
58-59 


60-61 
62-63 


64 


ES 


ES 


ES


ES 


7. Occupation of Nonstudent Ss (0=not applicable;
1=can't tell; 2=child care workers; 3=community
recreation workers; 4=employees in institutions;
5=regular class teachers, elementary; 6=middle
or junior high; 7=high school; 8=school administrators;
9=special class teachers; 10=vocational
rehabilitation counselors; 11=parents; 12=police;
13=medical; 14=general public; 15=other, specify
____)

Experimental 
Control 


8. Prior Experiences with Disabled Persons (0=can't
tell; 1=none; 2=as parents; 3=as siblings;
4=classmates; 5=as teachers; 6=in school;
7=as co-workers; 8=as supervisors; 9=in work
setting; 10=as clients/patients; 11=general,
not specific; 12=combination, specify ____;
13=other, specify ____)

Experimental 
Control 


9. Country of subjects (1=USA; 2=Canada; 3=Australia/
New Zealand; 4=Europe, specify ____;
5=other, specify ____)


65 


C.  Treatment/Intervention 

1. Basis (0=can't tell; 1=theory, explicit; 2=prior
research, no citations; 3=prior research, few
citations; 4=prior research, case developed;
5=practical experience/insight; 6=other,
specify ____)


66 


2.  Attitude  Change  Theory 

a. Theory (1=S-R/behaviora'  .nditioning;
3=congruity/equilibrium; 4=social judgment;
5=functional; 6=combination, specify
____)


67 


b. Relationship to treatment (1=mentioned but not
used; 2=brief allusion; 3=explicit, well developed
basis; 4=post hoc interpretation;
5=implicit). If mainstreaming study with no
prior intervention to change attitudes, skip to
C.10., and X-out Sections C.3-9.




68-69 
70-71 


ES 


ES 


ES 


ES 




3. Setting (0=can't tell; 1=regular classroom; 2=
special classroom; 3=home; 4=institution; 5=group
home; 6=hospital; 7=dormitory; 8=playground;
9=camp; 10=recreation facility; 11=laboratory;
12=individual/small group; 13=normal life;
14=other, specify ____;
15=combination, specify ____)

Experimental 
Control 


72-73 
74-75 


4. Treatment/Intervention Technique(s) (0=none/
control; 1=placebo; 2=information; 3=direct
contact; 4=vicarious experience; 5=positive
reinforcement; 6=persuasive message; 7=persuasive
messages, contrast; 8=2&3; 9=2&4;
10=other, specify ____)

Experimental
Control


76-77 
78-79 


(End  of  Card  #1) 


a.  Information 

(1) Type (0=none; 1=etiology; 2=characteristics;
3=problems; 4=similarities with nondisabled;
5=prostheses and special equipment; 6=famous
disabled people; 7=legal rights; 8=parenting/
management; 9=self; 10=social relations;
11=other, specify ____; 12=
combination, specify ____)

Experimental 
Control 


1-2    Project code    M A    M A    M A    M A

3-6    Report ID#

7-8    Card #    0 2    0 2    0 2    0 2




9-10 
11-12 
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(2)  Delivery   mode   (O=none;  l=lecture; 
2=discussion; 3=lecture/discussion;
4=print; 5=panel discussion-disabled;
6=panel discussion-nondisabled;
7=speaker-disabled; 8=speaker-nondisabled;
9=video or films; 10=pictures/photos/filmstrips;
11=case study; 12=audio;
13=simulations; 14=regular course, specify
____; 15=regular program, specify ____;
16=other, specify ____;
17=combination, specify ____)

Experimental
Control


b. Direct contact (0=none; 1=as companions; 2=as
peer tutors; 3=in cooperative learning groups;
4=as classmates; 5=as classmates, behavior
modified; 6=as students, behavior modified;
7=practice teaching-special classes; 8=classroom
observation; 9=supervised playground
activities; 10=in recreation programs; 11=panel
discussions; 12=guest speaker(s); 13=visit to
institution/residential facilities; 14=integrated
community programs; 15=as teacher/counselor;
16=as co-workers; 17=other, specify ____;
18=combination, specify ____)


Experimental
Control


c. Vicarious experience (0=none; 1=role play-
contact; 2=role play-disabled; 3=simulation;
4=observation of role play or simulation; 5=
videotapes or films; 6=case studies; 7=pictures/
photos; 8=print fiction/biography; 9=dolls/
puppets; 10=other, specify ____;
11=combination, specify ____)


Experimental 
Control 


d.  Positive  reinforcement  (O=none;  l=covert;  2= 
overt) 

Experimental 
Control 


e. Persuasive message (0=none; 1=video/film;
2=audio; 3=print; 4=expert; 5=expert, disabled;
6=self-presentation; 7=other, specify
____; 8=combination,
specify ____)

Experimental
Control
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ES 


ES 


25-26 


27 
28 


29 


ES 


ES 


5. Treatment/Intervention to Change Attitudes
Toward: (1=disabled in general; 2=physically
disabled in general; 3=retarded, general;
4=moderately retarded; 5=severely retarded;
6=mentally ill; 7=emotionally disturbed;
8=visually impaired/blind; 9=hearing
impaired/deaf; 10=learning disabled; 11=
speech/language impaired; 12=cerebral
palsied; 13=epileptic; 14=para/quadriplegic;
15=other physically impaired, specify
____; 16=autistic; 17=health impaired;
18=multiply handicapped; 19=other,
specify ____; 20=combination,
specify ____)


6. Treatment/Intervention Conducted By: (0=
can't tell; 1=experimenter; 2=project
assistants; 3=regular staff; 4=combination,
specify ____; 5=other,
specify ____; 6=not applicable)

Experimental 
Control 


7.  Length  of  Treatment/Intervention 

a. Information available (0=no; 1=yes)


30 
31 


b.  Days/week 

Experimental 
Control 


32-34
35-37 


c.  Minutes/day 

Experimental 
Control 


38-40 
41-43 


44-48 
49-53 


d. Number of weeks (If less than 1,
insert x's)

Experimental 
Control 


e.  Total  #  of  hours 

Experimental 
Control 
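
Items 7.b through 7.e record the schedule and the total hours separately, which allows a simple consistency check when a report gives only the schedule. The sketch below assumes a regular schedule, so that total hours equal days per week times minutes per day times number of weeks, divided by 60. That relationship and the function name are assumptions for illustration; the form itself does not state how item 7.e was derived.

```python
def total_treatment_hours(days_per_week, minutes_per_day, weeks):
    """Estimate total treatment hours from a regular weekly schedule."""
    for name, value in (
        ("days_per_week", days_per_week),
        ("minutes_per_day", minutes_per_day),
        ("weeks", weeks),
    ):
        if value < 0:
            raise ValueError(name + " must not be negative")
    return days_per_week * minutes_per_day * weeks / 60.0


# Hypothetical schedule: 5 days a week, 30 minutes a day, for 4 weeks.
example_hours = total_treatment_hours(5, 30, 4)
```

A fractional value can be passed for weeks when a treatment lasted less than one week, the case the form marks with x's.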


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ES 


54 
55 


ES 


ES 


ES 


8. Verification of Treatment Implementation (0=none;
1=not necessary; 2=systematic observation;
3=nonstructured observation; 4=interviews
with Ss; 5=questionnaires to Ss; 6=intervenor
follow-up; 7=other, specify ____;
8=combination, specify ____)


Experimental 
Control 


56 
57 


58 
59 


a. Reporting (0=not applicable; 1=data; 2=data
and analysis presented; 3=assertion by
author(s))

Experimental 
Control 


b. Degree of implementation claimed (0=not
applicable; 1=none; 2=some; 3=mostly;
4=complete)

Experimental 
Control 


60 
61 


62 


63 


c. Basis for author's conclusion re implementation
(0=not applicable; 1=can't tell;
2=author's judgment or inference; 3=statistical
significance; 4=inspection of data; 5=combination,
specify ____; 6=other,
specify ____)

Experimental
Control


d. Actual implementation (1=complete; 2=mostly;
3=only in part)


e. Description of treatment adequate for
replication (0=no; 1=somewhat; 2=yes)
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9.  Treatment  validity 

a. Implementation (from 8d)

b. Hawthorne

c. John Henry

d. Treatment diffusion

e. Dissatisfaction/resentment

f. Novelty/disruption

g. Experimenter effect/expectations

h. Treatment/experimenter confounded

i. Test X treatment interaction

j. Multiple treatment interference

(0=can't tell; 1=not a plausible threat; 2=minor problem;
3=substantial problem; 4=major problem; 5=not applicable.
For 3, 4, & 5, write reason next to item.)

k. General Treatment Validity (1=excellent;
2=fair; 3=poor)


10.  Mainstreaming 

a. Presence (0=no mainstreaming; 1=mainstreaming
only; 2=pretreatment plus
mainstreaming)


b. Type of study (0=can't tell; 1=planned;
2=post hoc)


c. Instruction in mainstreamed classes (0=can't
tell; 1=standard group/class instruction;
2=cooperative learning; 3=individualized
instruction; 4=peer tutoring; 5=combination,
specify ____; 6=other,
specify ____)


d. Special personnel support (0=can't tell; 1=none;
2=teacher preparation; 3=consultant help; 4=other,
specify ____)


e. Special skills training for disabled students
(0=can't tell; 1=none; 2=social; 3=academic;
4=both)


73


(End of Card #2)


Project Code


Report ID#


f. Type of special skills training (0=not applicable;
1=coaching; 2=modeling; 3=counseling;
4=direct reinforcement; 5=group contingencies;
6=diagnostic/prescriptive; 7=cognitive control;
8=combination, specify ____;
9=other, specify ____)


g. Special instruction for nondisabled peers
(0=can't tell; 1=none; 2=information; 3=vicarious
experience; 4=reinforcement; 5=persuasive
messages)

h. Parent education (0=can't tell; 1=none; 2=disabled
children; 3=nondisabled children; 4=2&3)

i. Type of parent education (0=can't tell; 1=not
applicable; 2=information; 3=vicarious experience;
4=reinforcement; 5=persuasive messages)

j. Disabled children in mainstreamed classes
(0=can't tell; 1=mildly and moderately retarded;
2=emotionally disturbed; 3=visually impaired;
4=blind; 5=hearing impaired; 6=deaf; 7=communication
disordered; 8=physically and health
impaired; 9=learning disabled; 10=combination,
specify ____; 11=others, specify ____)


k. Number of minutes handicapped children spend
daily in mainstreamed class


l. Number of days per week in mainstreamed class


m. Total minutes per week in mainstreamed class


n. Months in mainstreamed program when outcomes
assessed

o. Outcome measured for (1=nondisabled peers; 2=
teachers; 3=parents, disabled children; 4=
parents, nondisabled children; 5=administrators;
6=combination, specify ____)


ES 


27 


28-29 


30 


31 


32 


ES 


33-34 


35 


ES  ES 
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D,    Dependent  Measures 

*1. "Attitude" defined (0=no; 1=affective; 2=cognitive;
3=behaviors; 4=1&2; 5=1&3; 6=1, 2, & 3)


*2. Number of dependent measures. List (by name of
instrument and form, if possible, or general
category):

1. ____    4. ____

2. ____    5. ____

3. ____    6. ____


3. Use of common instrument (0=no; 1=ATDP; 2=OMI;
3=RGEPS; 4=ATHI; 5=ATBS; 6=MRAI; 7=MTAI,
revised; 8=combination, specify ____)


4. Common instrument modified (0=not applicable; 1=no;
2=for different population of Ss; 3=for different
disabilities; 4=other, specify ____)


5.  Source  of  data  (l=self-report;  2=opinion  of 

other/  teacher;  3=opinion  of  other,  administrator; 

4=opinion  of  other,  specify     ; 

5=observation;  6=nonproject  request! 


6.  Type  of  assessment  (l=interview/  structured; 
2=interview,  nonstructured;  3=attitude  question- 
naire; 4=sociometric  measure;  5=peer  assessment; 
6=social  distance  scale;  7=informal  observation; 
8=systematic  observation;  9=semantic  differen- 
tial;- 10=telephone  or  mail  survey;  ll=telephone 
or  mail  request;  12=Q-sort;  13=projective  test, 
pictures;  14=sentence  completion;  15=adjective 
checklist;  16=rankin9s;  17=other/  specify 

) 


7,  Source  of  instrument  (O=not  applicable;  l=can't 
tell;  2= teacher-made;  3=project  dt^veloped;  4=prior 
research;  5=other,  specify  ) 




8. Development by project (0=not applicable;
1=description not provided; 2=description not adequate;
3=adequate description)

9. Reliability of scores

a. Mentioned? (0=no; 1=coefficient reported;
2=yes, but no coefficient)

b. Source (0=not applicable; 1=can't tell;
2=computed on sample; 3=reported from other research;
4=pilot study; 5=combination, specify ____)

c. Method (0=not applicable; 1=can't tell;
2=test-retest; 3=internal consistency;
4=alternate forms; 5=inter-observer -- %;
6=inter-observer -- r; 7=intra-observer -- %;
8=intra-observer -- r; 9=categorization
reliability; 10=combination, specify ____)

d. Coefficient

e. Magnitude (0=not applicable; 1=.80-1.0;
2=.60-.79; 3=.0-.59)

10. Validity of scores

a. Discussed (0=no; 1=moderately; 2=comprehensive)

b. Type (0=not applicable; 1=general; 2=face;
3=construct-discrimination or correlations;
4=construct -- expert judgment; 5=concurrent;
6=combination, specify ____)

c. Source (0=not applicable; 1=not mentioned;
2=general reference to literature; 3=citation
of research; 4=project data; 5=inference without
data; 6=combination, specify ____)

d. Reactivity of measure (1=low; 2=moderate;
3=high)

e. Adequacy of validity (1=low; 2=moderate;
3=high)


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11.  Data  collection 


a. Type (0=can't tell; 1=regular staff;
2=researcher; 3=research assistant; 4=noncontact;
5=other, specify ____)

b. Blinded collection (0=can't tell; 1=not
applicable; 2=no; 3=partial -- experimental or
controls or pre or post only; 4=yes)

12. Blinded scoring (0=can't tell; 1=not applicable;
2=no; 3=partial -- experimental or control, or
pre or posttest only; 4=yes)

13. Time of posttest (0=can't tell; 1=immediate;
2=delayed; 3=follow-up)

14. Weeks after intervention to posttest

E.  Internal Validity

1. Design (1=pre-post, control; 2=posttest-only;
3=Solomon 4-group; 4=nonequivalent control group;
5=single subject; 6=pre-post, one-group; 7=static
group; 8=other, specify ____)

2. Assignment to groups (0=can't tell; 1=random;
2=match-random; 3=select or match from different
group; 4=random assignment of intact groups;
5=convenience; 6=not applicable; 7=other,
specify ____)

3. Threats (0=can't tell; 1=not a plausible threat;
2=minor threat; 3=substantial threat; 4=major
problem. For 3 & 4, write reason next to item.)

a. Treatment Validity (from C.9.k., p. 9)

b. Maturation

c. History

d. Testing

e. Instrumentation

f. Statistical regression

g. Selection

h. Experimental mortality

General Internal Validity (1=high;
2=medium; 3=low)
F.  Results

1. Statistical significance (0=not available;
1=not significant at .05 level;
2=significant at the .05 level) [col. 68]

2. Author's conclusions about effectiveness

a. Qualified? (0=no; 1=sample; 2=interactions;
3=design; 4=measures; 5=treatment
verification; 6=replication;
7=other, specify ____;
8=combination, specify ____) [col. 69]

b. Treatment effective? (0=none; 1=didn't
have effect; 2=data equivocal;
3=produced effect; 4=produced negative
effect) [col. 70]

3. Effect size(s)

a. Available (0=no; 1=yes) [col. 71]

b. D

1) + D [cols. 72-77]

(End of Card #3)

********************************************

Project code: M A [cols. 1-2]

Report ID# [cols. 3-6]

Card #: 0 4 [cols. 7-8]

2) Source (0=not applicable; 1=reported;
2=calculated; 3=t or ANOVA F; 4=correlated t;
5=correlated t (.50); 6=n-way ANOVA; 7=COVAR;
8=COVAR (.50); 9=proportions, chi-square;
10=other nonparametric; 11=rpb;
12=significance level;
13=other, specify ____) [cols. 9-10]

3) Scale of X difference (0=not applicable;
1=raw post; 2=raw gain; 3=covariance
adjusted; 4=residual gain) [col. 11]
4) Standard deviation (0=not applicable;
1=post control/placebo;
2=pretest; 3=1&2 pooled; 4=1-way
ANOVA SSs; 5=n-way ANOVA SSs;
6=1-way ANCOVA SSs; 7=n-way ANCOVA
SSs; 8=adjusted COVAR sd; 9=adjusted
gain sd; 10=pooled post)

5) Mdn primary D for type of assessment

6) Mdn primary D, overall

c. Correlation

1) Type (0=none; 1=rpb; 2=Eta²; 3=Phi;
4=Cramer's V; 5=other, specify ____)

2) + coefficient

3) Source (0=not applicable; 1=reported;
2=calculated; 3=estimated from D)

4) Mdn primary coefficient for type of
assessment

5) Mdn primary coefficient, overall


d. Variance ratio

1) Ratio (S²E/S²C)


2) Mdn primary variance ratio, type
of assessment

3) Mdn primary variance ratio, overall
G.  Supplemental Information

1. Information gain

a. Reported? (0=not applicable; 1=yes;
2=no) [col. 52]

b. Type of report (0=not applicable;
1=verbal; 2=statistical significance;
3=descriptive statistics; 4=2&3) [col. 53]

c. Conclusions (0=not applicable;
1=clear gain; 2=no gain; 3=mixed
results; 4=can't tell/inconclusive) [col. 54]

d. Effect size -- D [cols. 55-60]

2. Type of Study (0=can't tell; 1=course
evaluation; 2=program evaluation;
3=experimental treatment; 4=mainstreaming,
variation in time) [col. 61]

H.  Coding Summary

1. Minutes spent coding [cols. 62-64]

2. Coder (1=Curtis; 2=Jesunathadas;
3=Shaver; 4=Strong) [col. 65]

RETURN  TO  PAGE  1  AND  COMPLETE  CHECKLIST 
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COMMENTS  ON  STUDY 

(Brief abstract, if not on report; special strengths, weaknesses, significance;
nuances of treatment, design, assessment; conclusions)
1. ID # or #'s of other reports: ____

2. Coding complete except for ES's

3. ES's computed    Date ____

4. ES's checked    Date ____

5. Abstract available:  ____ report   ____ comment sheet

6. COMMENTS ON CONVENTIONS sheet

7. COMMENTS ON STUDY sheet

8. Additional information needed from author(s)

a. Form completed

b. Request sent    Date ____

c. Additional information received    Date ____

d. Follow-up or thank you sent    Date ____

9. A vs. C, B vs. C Log Sheet
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APPENDIX  C* 
CODING  CONVENTIONS 


(1)  Conventions  for  Use  of  Coding  Instrument 

(2)  Computation of Effect Sizes

(3)  Conventions Addenda


*The pages for Appendix C are numbered in two ways: (1) consecutively with
the rest of the report, and (2) based on the coding instrument categories
and page numbers, as discussed in Chapter 3. The first number is (1); the
number following the colon is (2).
META-ANALYSIS:  MODIFYING ATTITUDES TOWARD DISABLED PERSONS
CONVENTIONS FOR USE OF CODING INSTRUMENT

General  Instructions 

1. Use of Numerical List. When you pick up a report to code, put your
initials and the date next to the report ID# on the Numerical List of
reports. When a report is used for a reliability check, write "Rel." and the
date next to the ID# on the Numerical List. If the information to code one
or more primary effect sizes is available, but information for one or more
others has to be requested, indicate that on the Numerical List by writing
"Info. Req." by your initials. When the information arrives and you are able
to code the report, scratch out the "Info. Req." note. If no ES can be coded
without additional information, have the report removed from the Numerical
List and make a marginal note on the Alphabetical List that information has
been requested. Put it back on the lists, with a new number, when the needed
information arrives. The purpose of these procedures is to make the
Numerical List the source for a quick check on the status of coding, both for
general information and for deciding when reliability checks are appropriate.

2. EFFECT SIZES and INFORMATION REQUEST Sheets. The first step in coding a
report, after skimming it to get a general sense of purpose and procedures,
is to fill out an EFFECT SIZES sheet. If information is not available to
compute the primary ESs, do not code the report. Complete an INFORMATION
REQUEST sheet and give it to the secretary. Attach the report and the EFFECT
SIZES sheet to the REQUEST sheet. Once a letter is written requesting the
information, the report and ESs sheet will be put in the "Information
Requested" stack in Room 423. When the information arrives, retrieve the
report and code it.

3. COMMENTS ON STUDY Sheets. Complete a COMMENTS ON STUDY sheet for every
report you code, even if the only comment written on it is "No comments" or
"Abstract in article". If the report does not have an abstract, write a
brief one on this sheet. Be certain to write on this sheet any
characteristics of the study that may be helpful or interesting in later
organization or interpretation of analyses. Such things as gender balance if
exact figures are not available to be coded, other design or sample
characteristics, treatment characteristics, interactions of interest but not
coded, authors' comments indicating special significance or interpretations
of results, and good discussions of test validity are among the types of
comments to record. Also, note the reasons for important coding decisions
that may be questioned later, such as why a dependent measure was not used in
an effect size.

4. EFFECT SIZE COMPUTATION Sheets. Be certain to record all major steps in
your computations of D's, correlation coefficients, and variance ratios. If
additional sheets are needed, enter the Report ID# on each and staple them to
the ES COMPUTATION sheet for the study. Submit COMPUTATION sheets along with
the Coding Instrument and other sheets to be filed once you complete the
coding of the study.

5. COMMENTS ON CONVENTIONS Sheets. Whenever you have difficulty coding a
study, note on a COMMENTS ON CONVENTIONS sheet the difficulty and how you
resolved it. The difficulty might involve how to define groups for ESs, an
instrument that doesn't fit well into any category in D.6. (p. 11), a study
which involved more than the 999.9 total hours provided for in C.7.3. (p. 7),
difficulties in fitting delivery modes to the categories in C.4.a.(2) (p. 6),
and so on. We will use the COMMENTS ON CONVENTIONS sheets both in
interpreting analyses and in writing the final report.

6. Entering Numerals. Be sure that the numbers you write on the Coding
Instrument can be read by the person keypunching. That means that they must
be dark enough to be legible, and written carefully so that numerals will not
be confused.

7. Coding Time. It is best to code a report in a single sitting if
possible. Enter your start and stop time at the top of page 1 so that you
can complete item G.1. at the end of the Coding Instrument. If you are not
able to complete a report in one sitting, be sure to repeat your previous
reading to the point where you have an adequate context for the coding.
Also, enter the stop and start times for each of the sittings so that you can
add up the times for category G.1.

8. Completing Spaces. Each code space must have something in it. Be sure
to fill in all spaces, including leading zeros (e.g., ID #0024), pluses and
minuses, and x's. Many of the categories in the Coding Instrument have a
code for indicating "not applicable" or "can't tell." In cases where
specific data are called for, such as the percentage of males in the
experimental and control groups, and the data are not available, insert x's.
Also, if an ES is coded for a pre-post, single-group design, or similar
design, enter x's in any control group spaces, if a "Not Applicable" code is
available.
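
The space-filling rule above can be illustrated with a short sketch. It is
not part of the original conventions: the function name fill_field and the
idea of passing a field width are assumptions made only for illustration.
It shows leading zeros for available values and x's for unavailable data.

```python
def fill_field(value, width):
    """Return a fixed-width code string for one field (hypothetical helper).

    value: an int or str to be coded, or None when the data are not available.
    width: number of code spaces provided for the field on the instrument.
    """
    if value is None:
        # Data called for but not available: every space gets an x.
        return "x" * width
    text = str(value)
    if len(text) > width:
        raise ValueError(f"{text!r} does not fit in {width} spaces")
    # Available data: pad with leading zeros so no space is left blank.
    return text.zfill(width)


print(fill_field(24, 4))    # report ID with leading zeros
print(fill_field(None, 3))  # percentage of males not reported
```

A field left blank would be ambiguous at keypunching, which is presumably
why every space receives either a digit, a sign, or an x.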
9. Completing Checklist. Be sure to go over all items on the checklist
after coding a study. A REPORT DISPOSITION sheet should be completed on
every study and COMMENTS ON CONVENTIONS and COMMENTS ON STUDY sheets should
be completed as appropriate. On the COMMENTS ON CONVENTIONS sheet, note any
special difficulties in coding, any use of "other" or "combination"
categories that raise questions about the adequacy of the other categories,
and any decisions not covered by the Conventions that you had to make about
category definitions in order to code the study. The COMMENTS ON STUDY (COS)
sheets will provide a source of information about each study that will be
more easily accessible than items on the completed Coding Instrument. If the
report does not include an Abstract, write a brief abstract on the COS sheet.
Also, note anything that makes the study particularly noteworthy, such as
particular strengths or weaknesses, unusualness of approach or population.
Also note any nuances of treatment, design, or assessment that may not be
obvious or noticed on the Coding Instrument. In analyzing the data and
writing up the results, it will be important, but difficult, to "keep in
touch" with the many studies we will have coded. The COS sheets will be
crucial both for alerting us to significant study characteristics and
variations and for identifying studies to review again, and even re-code, to
assist in our interpretations and discussions.

10. One Study/Multiple Reports. When multiple documents report analyses of
the same data, they should be coded as a single study. One document may be
adequately comprehensive to include others (e.g., a dissertation which
encompasses one or more articles published from it, or a preliminary report
encompassed by a final report). Assign an ID# to the main report and list
the other documents with the same ID#, but with A, B, etc. added. Or, the
data in multiple reports may all come from one study, even though no one
document is totally comprehensive; in that case, score the multiple reports
as if they were one. Assign an ID# to one of the reports (the major one, if
such can be identified; or, failing that, the one with the earliest date; or,
finally, by alphabetizing authors' names, then titles). Use that ID# with an
A, B, or C, etc. (ordering by authors' last name and then by date of
publication) on the other reports. Use only the ID# for the coded or major
document in the data columns of the coding sheet but be certain that the
Coding Instrument and the REPORT DISPOSITION sheet have recorded on them the
identification number of any subsumed document. On the EFFECT SIZES sheet
and at the bottom of page one of the Coding Instrument, where the ESs are
described, use the ID number and letters to indicate the articles from which
the different effect sizes were taken.
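
As an illustration only, the lettering of subsumed reports might be sketched
as follows. The function name, the (author last name, year) tuples, and the
example ID are hypothetical; the ordering by authors' last name and then by
date of publication follows the instruction above.

```python
from string import ascii_uppercase


def label_subsumed_reports(main_id, others):
    """Attach A, B, C, ... to the main report's ID# for the other documents.

    main_id: ID# of the coded or major report, as a string (e.g. "0024").
    others:  list of (author_last_name, year) tuples for the other documents
             (a hypothetical representation chosen for this sketch).
    """
    if len(others) > len(ascii_uppercase):
        raise ValueError("more documents than available letters")
    # Order by authors' last name, then by date of publication.
    ordered = sorted(others, key=lambda doc: (doc[0].lower(), doc[1]))
    return {doc: f"{main_id}{ascii_uppercase[i]}" for i, doc in enumerate(ordered)}


labels = label_subsumed_reports("0024", [("Smith", 1979), ("Jones", 1981)])
print(labels)
```

Only the main ID# would go in the data columns; the lettered IDs serve to
trace each effect size back to the document it came from.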

11. Reading. Typically, you should skim each report before attempting to
score it. Often you will find information in unexpected places. For
example, you may find information describing the sample in the Conclusions
rather than in the Sample section. In dissertations, the Acknowledgments can
be a rich source of information, for example, in regard to whether persons
other than the author carried out the treatment or gathered data.

12. Rounding Computations. In computing effect sizes (ESs), carry out
computations to the fourth decimal place and then round the ES to two decimal
places. (You may also need to round in inserting other numerical data.)
Round up if the number in the third decimal place is 6 or higher, if it is a
5 followed by a 5 or higher, or if it is a 5 followed by a 5 which is
followed by a 6 or higher.
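
The rounding rule can be sketched in code as a check on hand computations.
This is a minimal sketch under stated assumptions: the name round_es is
hypothetical, the value is truncated (not rounded) at the fourth decimal
place, and the rule is applied to the magnitude so that negative ESs round
away from zero. With only four places carried, the third clause of the rule
(a 5 followed by a 5 followed by a 6 or higher) cannot arise separately and
is covered by the second.

```python
from decimal import Decimal, ROUND_DOWN


def round_es(value):
    """Round an effect size to two decimals using the convention above.

    value: the computed ES as a string or Decimal (hypothetical interface).
    """
    d = Decimal(value)
    sign = -1 if d < 0 else 1
    # Carry the computation to the fourth decimal place (assumed: truncation).
    mag = abs(d).quantize(Decimal("0.0001"), rounding=ROUND_DOWN)
    digits = int(mag * 10000)            # 0.4356 becomes 4356
    third = (digits // 10) % 10          # digit in the third decimal place
    fourth = digits % 10                 # digit in the fourth decimal place
    result = Decimal(digits // 100) / 100
    # Round up on 6 or higher, or on a 5 followed by a 5 or higher.
    if third >= 6 or (third == 5 and fourth >= 5):
        result += Decimal("0.01")
    return sign * result


print(round_es("0.4356"))  # third place 5, fourth place 6: rounds up
print(round_es("0.4352"))  # third place 5, fourth place 2: stays
```

Note that under this reading a 5 in the third place followed by 0 through 4
does not round up, which differs from ordinary half-up rounding.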
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CODING  INSTRUMENT 
A.  GENERAL  INFORMATION 
The Coding Instrument is set up with four columns for effect sizes.
Some studies may have fewer than four effect sizes, some may have more. When
there are more than four effect sizes, use multiple Coding Instruments. The
ES numbers should be filled in at the top of each sheet. Complete an EFFECT
SIZES sheet before assigning ES numbers.

A.1. Project code. To identify IBM cards for the project, the code M A
(meta-analysis) will be punched in each. The project code is typed on the
Coding Instrument.

A.2. Report ID#. Write in the ID number of the study for each ES. Be certain
to fill in all spaces, using zeros not x's. The ID number will be the same
for every effect size for a study. For multiple documents scored as one, use
one ID# (see prior item 8).

A.3. Card numbers. Card numbers will be typed on the code sheets.

A.4. Year of publication. It is assumed that all studies will have been
published in the twentieth century, so insert the last two numbers of the
publication year. Note that for dissertations, the date of the dissertation
is the number to enter, not the date on which it was included in Dissertation
Abstracts. If a report does not have a date on it, go to its list of
references and find the most recent reference, add two years, and then enter
that date. Note the asterisk (*4.). Whenever an asterisk is included for a
category, the information is the same across all ESs.
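
The dating rule in A.4. can be sketched as follows. The function name and
arguments are hypothetical, introduced only to illustrate the two cases: a
dated report, and an undated report whose year is estimated from its most
recent reference plus two years.

```python
def publication_year_code(year=None, reference_years=()):
    """Return the two-digit year code for item A.4. (illustrative sketch).

    year:            four-digit year printed on the report, or None if undated.
    reference_years: years of the works in the report's reference list,
                     used only when the report itself carries no date.
    """
    if year is None:
        if not reference_years:
            raise ValueError("undated report with no dated references")
        # Undated report: most recent reference plus two years.
        year = max(reference_years) + 2
    # Enter only the last two numbers of the year.
    return f"{year % 100:02d}"


print(publication_year_code(1978))
print(publication_year_code(None, [1969, 1974, 1972]))
```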
A.5. Type of report. "Dissertation" refers to a doctoral study and "thesis"
refers to a master's study. Unpublished report ("7") includes all
unpublished reports other than convention papers ("6"), including government
reports, reports that are part of occasional paper series, and mimeographed
publications of a research center. If multiple reports are available for the
same study (e.g., a dissertation and a journal article), code "9" and enter
the types of reports used.

A.6. Effect Size(s).

Description of ES Comparison(s)
The number of effect sizes is a function of the number of dependent
measures and the number of comparisons which are of interest. Complete an
EFFECT SIZES sheet, providing a brief description of each effect size,
including the dependent measure and the groups compared (see A.6.d. below).
List primary effect sizes (see A.6.c., below) first, followed by secondary
effect sizes. Then, enter in the left hand column of the Coding Instrument
the ES# from the EFFECT SIZES sheet. These will be the numbers which you
will place at the top of each page and which you will enter in A.6.b. ID of
ES. Also enter the ES descriptions. Be certain to paper clip the EFFECT
SIZES sheet to the Coding Instrument when you are done with the report.

A.6.a. N of ESs. The N of ESs is taken from the EFFECT SIZES sheet. Two N's
will be entered: the N of primary ESs and of secondary ESs. That is, the
number of primary effect sizes will be entered in each primary ES column, and
the number of secondary effect sizes will be entered in each secondary ES
column. Even if there are Prior Contact secondary ESs (see A.6.d., #'s 13,
14, 15), enter the number of secondary ESs coded on the original Coding
Instrument (pp. 1-16) only. The number of Prior Contact secondary ESs will
be coded on page 1 (1.9) of the Prior Contact Coding Sheet. For studies for
which information is lacking to code ESs, but for which statistical
significance results are available (coded on the ES Information Missing
CODING INSTRUMENT), the N of ESs is the number you would code if ES data were
available. If there are two sets of ESs for a report, one coded on a regular
and one on an ES Information Missing CODING INSTRUMENT, i.e., ID# (*), the N
of ESs is entered separately for each set.

A.6.b. ID of ES. Fill in the numbers assigned on the EFFECT SIZES sheet.
Note that although the N of ESs is figured separately for regular and Missing
Information sets of ESs, ES ID numbers are to be assigned serially for all
ESs, with the regular ESs numbered first. The keypunch operator will punch
the ES number from item A.6.b. You should write ES numbers at the top of
each page for your own information as you code.

A.6.c. Level. Two levels of ESs will be scored: primary and secondary. A
"primary" effect size is one which involves the comparison of a treatment
group with a control or placebo group or the comparison of two treatment
groups. If a delayed posttest is administered, treatment vs. control/placebo
or Treatment A vs. B comparisons for it also yield primary ESs. For
pre-experimental single-group designs, pre-posttest comparisons yield primary
ESs.

"Secondary" effect sizes are of two types: interactions, and treatment
comparisons within levels of classification variables, i.e., within a
subgroup. For example, in addition to comparing an experimental and a
control group for the total sample, data may be available for a
treatment-control comparison for males and for females. These data may be
part of a table for which there is an interaction effect size, or may be
presented in a table even though the interaction effect was not analyzed.

A.6.d. Type of comparison. A "treatment group" is one to which some
treatment or intervention has been applied. In some studies, interventions to
modify attitudes will be compared with each other, with no control or placebo
group. This will yield "Treatment A vs. B" comparisons. If, however,
Treatment A and Treatment B, or Treatment A1, A2, A3 (i.e., the same
treatment is applied to different groups), are each compared to a control or
a placebo condition (C), code the A vs. C, B vs. C, or A1 vs. C, A2 vs. C,
etc. ESs, but not A vs. B or A1 vs. A2, etc. Enter the study on the A vs. C,
B vs. C Log Sheet for easy identification in case we later want to run A vs.
B analyses. A "control group" is one to which no treatment has been
applied, while a "placebo group" is one which receives a treatment intended
to have no effect. In a study with more than one control or placebo group,
unless the groups are clearly different, pool the means to compare with the
treatment means in ESs. With a separate-sample, pretest-posttest, control
group design (Campbell & Stanley, p. 54), code the design as "1" and code the
pretest group as a control group throughout.
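
The choice of which comparisons to code can be sketched as below. This is an
illustration, not the project's procedure: the function name and labels are
hypothetical, and pairing every treatment with every other treatment when
there are more than two treatments and no control group is an assumption the
conventions do not state explicitly.

```python
from itertools import combinations


def comparisons_to_code(treatments, has_control):
    """List the primary comparisons to code for one dependent measure.

    treatments:  treatment labels in coding order, e.g. ["A", "B"].
    has_control: True if a control or placebo condition (C) is present.
    """
    if has_control:
        # Each treatment vs. C; treatment-vs-treatment ESs are not coded.
        return [(t, "C") for t in treatments]
    # No control or placebo group: treatments are compared with each other
    # (pairwise pairing beyond two treatments is assumed here).
    return list(combinations(treatments, 2))


print(comparisons_to_code(["A", "B"], has_control=True))
print(comparisons_to_code(["A", "B"], has_control=False))
```

Studies handled by the first branch are the ones to enter on the A vs. C,
B vs. C Log Sheet, since their A vs. B comparisons are left uncoded.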

As you go through the coding sheet, you will find that data spaces are
often labeled "experimental" and "control". In coding, experimental =
treatment. Use the "control" space for placebos, too. For Treatment A vs. B
comparisons, use the "experimental" space for what you have labeled on page 1
as "Treatment A" and the "control" space for what you have labeled
"Treatment B". If, in a comparison of two treatments, one can be expected to
be more powerful than the other, label it "Treatment A". For example, if one
group is provided information about disabled persons in an effort to change
attitudes and another group in the same study is provided the information
plus contact with disabled persons, the latter treatment would be assumed to
be more powerful, with its combination of elements. It would be labeled
Treatment A, and coded in the "experimental" group spaces. If no potential
difference in treatment power can be discerned, the labeling of Treatment
A's and B's (and C's and D's, etc., if necessary) is arbitrary. It is
essential in any case that the same order be preserved in entering data
throughout the coding sheet.

"Pre-post" effect sizes involving pre-posttest data from the same
group ("4"), will be coded only for single-group, pre-post designs and will,
for that pre-experimental design, be primary effect sizes.

Factorial designs yield secondary ES's to be coded. Four types of
interactions are identified specifically on the Coding Instrument: treatment
by gender ("5"), treatment by age/grade ("6"), treatment by testing ("7"),
and treatment by personality ("8"). In addition, there is a place to
specify an interaction involving some other classification variable ("9").
"Treatment by personality" refers to an interaction between the treatment and
levels of a personality measure of some sort. Specify the personality
measure in the space provided. For interactions, Eta² will be the effect
size. Typically, the only within-group secondary ESs to be coded (including
for single-group, pre-post designs) are for gender ("10"; "11"). But if
results are reported by grade (or age) level and differ across grade (or age)
levels, code within-grade or within-age level ESs. In a factorial analysis
in which the treatment by gender (sex) interaction (or treatment by grade
interaction) is reported, within-gender (or within-grade or within-age level)
ESs are redundant; code only the interaction ES, but note on the COMMENTS ON
STUDY sheet the pattern of treatment by gender or treatment by grade means.
The "other" category ("12") would be coded for within-grade or within-age
ESs, or for another such comparison of special interest.

Categories "13", "14" and "15" are used for Prior Contact ESs. Enter
"13" if there is an analysis of prior contact levels within the treatment
group; enter "14" if there is analysis of treatment effects within prior
contact levels across groups (as treatment effects within levels of gender
were coded); enter "15" if the interaction of prior contact and treatment is
analyzed. If "15" can be coded, do not code "13" or "14". If "14" can be
coded, do not code "13". If more than one definition of prior contact (e.g.,
number and frequency) are analyzed separately, or if quality of prior contact
is analyzed separately, each is the basis for an ES.
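
The precedence among the Prior Contact categories can be sketched as a small
decision function. The function and argument names are hypothetical; only the
category numbers and their order of precedence come from the paragraph above.

```python
def prior_contact_category(within_treatment, within_levels, interaction):
    """Return the single Prior Contact comparison code to use, or None.

    within_treatment: prior contact levels analyzed within the treatment group.
    within_levels:    treatment effects analyzed within prior contact levels.
    interaction:      prior contact by treatment interaction analyzed.
    """
    if interaction:
        return 15      # "15" takes precedence over "13" and "14"
    if within_levels:
        return 14      # "14" takes precedence over "13"
    if within_treatment:
        return 13
    return None        # no Prior Contact ES available


print(prior_contact_category(True, True, False))
```

The function would be applied once per definition of prior contact, since
each separately analyzed definition is the basis for its own ES.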

A.7. Target Population. The target population is the universe or group to
which the researcher would like to generalize. "Zero" is scored if there is
no mention of such a group. "1" is used if the term, "target population,"
is used in discussing the purpose of the study. "2" is scored if
characteristics of the target population are mentioned explicitly, as, e.g.,
"The results of this study are meant to be generalizable to the attitudes of
middle-class high school students toward persons who are blind." If the
target population is actually described using data, such as from national
tests or the census, "3" is to be scored. Combinations to be scored are
indicated. Codable references to target populations will be found in the
sections on Purpose and Methods. Despite what is said on item 10 above (p.
4) in regard to finding information in unexpected places, do not infer the
target population from the Discussion or Conclusions sections, as the issue
is whether the target population was identified prior to the study.

A.8. Accessible population. The accessible population is the group from
which the researcher drew the sample or samples. It may be identical with
the target population. Or, when the researcher uses all of an accessible
group (e.g., all of the students in the fifth grade in a school district or
all students in a college of education), the accessible population and sample
are identical. If, however, the researcher uses an "available" subgroup
(e.g., 50 students enrolled in two sections of a college course or two
classrooms of fifth-grade students in a school or school district), that
group is his/her sample, "drawn", however inappropriately, from a larger
accessible population (e.g., all students enrolled in the course and similar
courses, all students in a college, all fifth-grade students in a school or
school district). This distinction may be difficult to make. But the
question to ask is, were other similar Ss available within the context from
which the researcher selected the group or groups to use? Researchers
usually consider the convenient (i.e., captive, intact) groups they use to be
samples; rarely is it intended that the convenient sample encompasses the
entire accessible population. In many cases, the sample will be defined or
described, but not the accessible population; and, rarely are they identical
unless the researcher says so specifically.

If the term "accessible population" is used to describe the group from
which the sample was selected, code "1". If general characteristics of the
accessible population are indicated, code "2". For example, the report might
say, "Junior and senior students were selected from four urban Chicago high
schools." If, however, the characteristics of the accessible population are
described based on data available prior to or as part of the study, code "3".
For example, the following would be a "3": "The sample was drawn from the
senior class of nursing students at the University of Illinois. Made up of
60% females and 40% males, 30% had at least one year of experience working
in a hospital." Use codes for combinations of 1, 2, and 3, as appropriate.
Note that in many cases the researcher may describe the sample for the
research but not mention accessible population. In that case, code "0".

A.9. Replication. Replication involves repetition of a study to determine if
the findings will hold up under either the same or different conditions.
"Direct replication" refers to a study conducted with the same population and
treatment conditions as the prior study. "Systematic replication" involves
planned variation in population and/or conditions. A replication may be a
repetition of prior research (A.9.a.), or there may be a replication within a
study (A.9.b.), as when a researcher carries out a study at different grade
levels or in different schools or with subjects with different
characteristics to determine if the results will hold up across the various
conditions. A study may involve both replication of other research and
replication within itself. If research is reported in which there are
applications of the treatment to different populations of Ss, which should
not be pooled to get ESs, code "3" in category A.9.b. within study, even if
the researcher apparently had not intended to carry out a replication, and
note what you have done on the Coding Instrument and on the COMMENTS ON
CONVENTIONS sheet. If the chronological order of within study replications
is clear, those applications of treatment following the first one are coded
as replications; if the chronological order is not clear, code all
applications as replications.
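
The chronological-order convention for within-study replications reduces to
a simple rule, sketched below. The function and argument names are
hypothetical, and the list is assumed to hold the applications of treatment
in chronological order whenever that order is known.

```python
def flag_within_study_replications(applications, order_is_clear):
    """Pair each application with True if it is coded as a replication."""
    if order_is_clear:
        # The first application is the original; later ones are replications.
        return [(app, index > 0) for index, app in enumerate(applications)]
    # Order unknown: every application is coded as a replication.
    return [(app, True) for app in applications]
```
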

B. DESCRIPTION OF SAMPLE(S)
B.1.a. N of total sample. The number to be inserted here for primary ESs
is the total number of subjects included in the effect size data at the time
of data analysis; that is, the number in the comparison with which the
project ended, not started. If N is presented, and no other information is
provided or inferable from analysis tables, assume that it is the N at time
of data analysis. Sometimes the total N will not be given but can be
inferred rather precisely from references to the number of groups and whether
they are of equal size, or from the degrees of freedom in analysis tables.
For secondary ESs, enter the N at data analysis for the Ss included in the
particular ES. For example, if the total N was 100 with 50 females and 50
males, 100 would be entered for the primary ESs (if they involved all Ss) and
50 each for the treatment-within-gender, male-female secondary ESs. If the
number of classrooms, but not the number of students, is given, use 25
Ss/class at the elementary level and 30 Ss/class at the secondary school
level to estimate N and n's unless there is an indication that regular
classes were not used.
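
The two ways of recovering an unreported N described above can be sketched
as follows. The function names are hypothetical. The degrees-of-freedom
helper assumes a one-way between-groups analysis, in which the error degrees
of freedom equal N minus the number of groups; other designs need a
different rule.

```python
# Default class sizes stated in the convention for B.1.a.
SS_PER_CLASS = {"elementary": 25, "secondary": 30}


def estimate_n_from_classes(num_classes, level):
    """Estimate N when only the number of regular classrooms is reported."""
    if level not in SS_PER_CLASS:
        raise ValueError(f"unknown school level: {level!r}")
    return num_classes * SS_PER_CLASS[level]


def infer_n_from_error_df(df_error, num_groups):
    """Recover total N from the within-groups (error) degrees of freedom.

    Assumes a one-way between-groups analysis, where df_error = N - k.
    """
    return df_error + num_groups
```

Four elementary classrooms would thus be entered as an estimated N of 100,
provided nothing in the report suggests the classes were not regular ones.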

B.1.b. N of experimental Ss. This is the number of students in the analysis
(see B.1.a. above) for the treatment group for the particular effect size
being scored. In a pre-post, one-group design, enter the group N here and
put x's in the control group spaces (B.1.c.). For interaction ESs which
involve more than two treatment groups, enter x's both here and in the
control group spaces.

B.1.c. N of control Ss. Again, this is the number of subjects in the control
group (or Treatment B or placebo group) for the analysis (see B.1.a. above)
for the particular effect size being scored. For pre-post, one-group designs
and interaction ESs, enter x's.

B.2. Sample selection. Many reports do not mention how the sample of
subjects was obtained, and it may not be obvious from the discussion. In that
case, code "0". "Random" sample selection refers to the use of a procedure
which ensures that each individual has an equal chance of being chosen from a
population. In order to code "1" there must be explicit mention of such a
process or use of the term "random" in reference to selection. In some
studies, the researcher will solicit persons to be involved or will use
persons who voluntarily come into a program. In that case, "2" should be
coded. Research is often done with "captive/intact" groups, such as school
classes or course sections: individuals do not volunteer, but are involved
because of group membership. Then, "3" should be coded. If, however, a
researcher uses school classes as the selection unit and then asks parents
for permission for their children to participate, code "2". If groups,
rather than individuals, are randomly selected, code "4", unless groups are
the unit of analysis, in which case code "1". This category is to be coded
separately for each effect size and for experimental and control groups, as
the selection method is not always consistent across groups.
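
The selection codes can be read as a short decision rule, sketched below.
The function and its keyword arguments are hypothetical. The order in which
the conditions are tested is an assumption made for illustration, since the
conventions do not say how to resolve every possible overlap.

```python
def sample_selection_code(*, mentioned=True, random_individuals=False,
                          random_groups=False, groups_are_unit=False,
                          volunteers=False, parental_permission=False,
                          intact_groups=False):
    """Return the B.2 code for one group within one effect size."""
    if not mentioned:
        return 0  # selection method not reported or not evident
    if random_groups:
        # Randomly selected groups are "4" unless groups are the unit
        # of analysis, in which case selection counts as random ("1").
        return 1 if groups_are_unit else 4
    if random_individuals:
        return 1
    if volunteers or parental_permission:
        # Intact classes followed by parental permission also code "2".
        return 2
    if intact_groups:
        return 3
    return 0
```

Because selection may differ across groups, the function would be applied
once for the experimental group and once for the control group.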

B.3. % male. As with Ns, enter for those Ss involved in the particular ES.
If this information is not available, enter x's. Do not enter a decimal point
but simply a three-digit number. For example, 4% would be entered simply as
004, 90% as 090, and 100% as 100. Do not estimate percentages for treatment
or control groups based on total sample information unless there is evidence,
such as mention of random assignment, that the subgroups are constituted
similarly to the total group. For interaction ESs, enter x's in the
experimental and control spaces.
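
The three-digit entry format can be illustrated with a short helper. The
function name is hypothetical. Rounding to a whole percent and filling all
three spaces with x's are assumptions, as the conventions say only to omit
the decimal point and to enter x's when the figure is unavailable.

```python
def format_percent_male(percent):
    """Format a percentage as the three-character entry for item B.3."""
    if percent is None:
        return "xxx"  # information not available
    if not 0 <= percent <= 100:
        raise ValueError("percent must lie between 0 and 100")
    # Zero-pad to three digits with no decimal point.
    return f"{round(percent):03d}"
```
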

B.4. Treatment context. The point of this item is to code the general milieu
of the research, such as whether it was in a public school context. The
question is, "What was the treatment a part of?"

"Elementary or secondary schooling" ("1") includes preschool as well as
K-12. "College/university education" ("2") includes undergraduate and
graduate education in degree-type programs. "Adult education" ("3") refers to
noncredit, noninservice type programs. "Inservice" ("4") is job-related, but
not on-the-job education. "Work" ("5") refers to interventions conducted on
actual work sites while people are working. An example would be providing
information about disabled persons to MD's as a part of the staffing of
cases. "Community" ("6") refers to studies conducted in general community
settings, such as in a shopping mall, through newspaper articles, or through
a broadcast over general television (i.e., not closed circuit television).
"Recreation" ("7") refers to contexts such as parks, playgrounds, and
overnight camps.

B.5. Educational level of Ss. "Preschool" ("1") refers to subjects younger
than kindergarten age (age five). "Primary" ("2") includes grades K through
3. "Intermediate" ("3") includes grades 4 through 6. "Middle school" ("4")
typically includes grades 6 through 8. While this category may overlap with
"intermediate," code the Ss as referred to in the study. That is, if they
are in the sixth grade, but referred to as middle school students, enter a
"4", but if they are in the sixth grade in an elementary school, code "3".
Medical students are graduate students, "9". The point of this
categorization is to code educational level as relevant to the intervention.
For example, with a program for practicing physicians or for teachers, Ss
would be coded as post-professional ("10"). If the program is designed
generally for adults who are not in an educational program and who vary in
educational level, then "11" should be scored.

B.6. University students according to major. If any category other than "8"
or "9" is scored in the item immediately above, code "0=not applicable" for
this item. Otherwise, enter the code for the particular major or, if the
major is not included in the list, enter "15" and then write in the major.

B.7. Occupation of Non-student Ss. If the subjects in the study are
students, then code "0" for this item. "Community recreation workers" refers
to persons who work in such places as playgrounds and parks, including
overnight recreational facilities, unless they clearly fall in some other
category, such as teachers.

B.8. Prior experiences with disabled persons. If the report indicates that
the Ss have had experience with disabled persons prior to the research, code
the appropriate category. If the report indicates that disabled persons are
in the same setting (e.g., in a special class in a school, or in jobs at a
business) but it is not clear whether or how much contact the Ss have had
with them, code "6" or "9", as appropriate. Reports may indicate that the Ss
have had prior experience with disabled persons, but not specify what that
experience was. In that case, "11" should be coded. If some Ss have had
experience with disabled persons and others haven't, code "12", specifying
"1" and the appropriate code or codes for the experienced Ss.

B.9. Country of subjects. The concern here is not where the research was
conducted nor the country where the report was published, but with the
country of the Ss. If, for example, a study was conducted in Germany at an
American dependent school with the children of U.S. servicemen, then "1"
would be coded.

C. TREATMENTS/INTERVENTION
C.1. Basis. Note that the concern here is with the rationale for the type
of treatment or intervention, not with the rationale for doing research on
the topic. Sometimes a treatment or intervention is reported with no
indication of how the treatment was arrived at. In that case, score "0". If
the treatment is tied explicitly to an attitude change theory, score "1". It
will often be helpful to check the titles of citations to see if they are to
research reports or theoretical discussions. Also, use of one of the five
category labels in Table 1 (pp. C-2 - C-3) is not necessary; they are for
organizational purposes. Look rather for the theory names under "A.
Theories" in Table 1. If the report simply refers to prior research as the
basis for the intervention or the study, but without citations, score "2".
If the study/intervention is based on prior research, but only one to three
or so studies are cited loosely, then score "3". If the prior research is
cited and discussed as part of a well-developed case for the research, score
"4". If no research is cited, but the report refers to experience in
general or to experience in a particular school or other setting, score "5".

C.2.a. Attitude change theory. In the prior category, a "1" indicates that a
theory was the explicit basis for the study/intervention. If that was the
case, then code the theory in one of the five choices for this category. If
the theory base was not explicit, then the theory to which the treatment
seems most highly related should be coded.

Five kinds of information to use in classifying studies according to
attitude change theory are presented in Table 1: (1) the names of attitude
change theories which fall within each of the categories; (2) the names
of authors who are associated with each category of attitude change
theory; (3) common terminology within each of the attitude change
theory categories; (4) a listing of primary aspects of theories that fall
within each category; and (5) a list of the particular strategies and
techniques for change that are likely to be used by persons operating from
each theoretical position.

Table 1. Major Categories of Attitude Change Theories

1. Stimulus-Response & Behavioristic Theories

   A. Theories
      . Hull-Spence Behavior Theory
      . Reinforcement Theory
      . Yale Communication Program

   B. Names
      . Hovland, Janis, & Kelley
      . Weiss

   C. Terminology
      . attention, comprehension, acceptance, retention
      . persuasive communication
      . source credibility
      . incentives
      . practice/mental rehearsal
      . effective excitatory potential
      . communication [remainder illegible in source]
      . communication effects

   D. Theory Aspects
      . Changing opinions results in attitude change
      . Opinions change when a subject has internalized a valuational
        message
      . Opinions change when the learner perceives arguments to be
        reasonable and logical
      . Variables affecting acceptance of a communication are (a)
        observable characteristics of the source, (b) the setting in which
        the communication is presented, and (c) the strength of the
        arguments or appeals

   E. Strategies/Techniques
      . present subject with a persuasive communication (may include
        neutral material)
      . use experts or prestigious persons as source
      . use counter-attitudinal advocacy (e.g., reading a persuasive
        communication expressing opinion different from one's own with
        expression and conviction; reading and defending an opinion
        different from one's own under forced compliance)

2. Conditioning Theories

   A. Theories
      . Classical Conditioning
      . Skinnerian Theory
      . Operant Conditioning

   B. Names
      . Bem
      . Doob
      . Staats
      . Scott

   C. Terminology
      . overt stimulus, implicit response, overt behavior
      . higher order conditioning
      . reinforcement
      . cues

   D. Theory Aspects
      . New attitudes are learned through systematic control of
        contingencies

   E. Strategies/Techniques
      . use verbal reinforcement
      . use classical conditioning involving external stimuli

3. Consistency/Equilibrium Theories

   A. Theories
      . Affective-Cognitive Consistency Theory
      . Balance Theory
      . Belief-Congruence Theory
      . Cognitive Balancing
      . Cognitive-Dissonance Theory
      . Congruity Model or Theory
      . Consistency Theory
      . Dissonance Theory
      . Logical-Affective
      . Field Theory

   B. Names
      . Abelson
      . Brehm & Cohen
      . Cartwright & Harary
      . Feather
      . Festinger
      . Heider
      . Lecky
      . Lewin
      . McGuire
      . Newcomb
      . Osgood & Tannenbaum
      . Rokeach
      . Rosenberg & Selley

   C. Terminology
      . assertions
      . attitude object
      . attractions
      . balance
      . cognitive elements
      . cognitive relations
      . cognitive unit
      . communication
      . conceptual arena
      . congruity
      . equilibrium
      . orientation
      . psycho-logic
      . sentiment relation
      . Socratic effect
      . strain
      . symmetry
      . system of orientation
      . triad
      . unit relation

   D. Theory Aspects
      . Attitudes possess psychological structure; a change in the
        affective component will result in a change in the cognitive
        component and vice versa
      . Imbalance/strain/incongruity/dissonance result in tension which
        forces a change toward equilibrium
      . Inconsistencies are discovered through thought
      . Persons who perceive each other as similar should be attracted to
        each other
      . Interaction and proximity should result in positive attitudes

   E. Strategies/Techniques
      . employ a person whom subject respects to present communication
      . associate person, event, or idea with person, event, or idea
        subject respects
      . present opinion as being similar to opinion held by subject
      . demonstrate to subject that attitude object is relevant to the
        attainment of certain values
      . assist subject to recognize conflicts within his/her values
      . employ prestigious suggestion
      . employ counter-attitudinal acts (e.g., role playing, simulation)
      . present the disabled as being "normal"
      . provide contact with the disabled so that respect can develop
      . peer tutoring
      . cooperative learning

4. Social Judgement Theories

   A. Theories
      . Assimilation-Contrast Theory

   B. Names
      . Helson
      . Sherif & Hovland

   C. Terminology
      . anchors
      . assimilation
      . contrast
      . latitude of acceptance
      . latitude of rejection
      . level of adaptation
      . level of noncommitment
      . reference points
      . reference scales

   D. Theory Aspects
      . An opinion that is not too discrepant from that held by a person
        will be accepted
      . An opinion that is quite discrepant from that held by a person
        will be rejected

   E. Strategies/Techniques
      . measure a person's point of view and then present a persuasive
        communication that falls just outside the point of view

5. Functional Theories

   A. Theories
      . Katz's Theory
      . Kelman's Theory
      . Psychoanalytic Theory
      . Smith, Bruner, & White's Theory

   B. Names
      . Katz
      . Kelman
      . Smith, Bruner, & White
      . Sarnoff
      . Stotland
      . Adler

   C. Terminology
      . compliance
      . ego-defensive function
      . externalization
      . identification
      . instrumental/utilitarian function
      . internalization
      . knowledge function
      . object appraisal
      . social adjustment
      . value-expressive function

   D. Theory Aspects
      . To change attitudes it is first necessary to know the function of
        the attitude to be changed
      . Attitudes change when they no longer serve their original
        functions

   E. Strategies/Techniques
      . provide new information and arguments to show that attitude is
        not useful
      . remove perceived threat associated with attitude
      . identify more appropriate values/attitudes
      . show that acceptance of attitude is not the best means of
        achieving social reward
      . show that acceptance of attitude is not the best way to achieve
        important values

C.2.b.  Relationship  to  treatment.  In  some  cases,  articles  will  mention  a 
theory  but  not  go  on  to  draw  any  connection  to  the  treatment  or  study.  That 
should  be  coded  "1".  If  the  theory  is  discussed  only  briefly  in  a  sentence 
or  two  as  a  basis  for  the  treatment,  code  "2".  If  the  treatment  and  study 
are well related to the theory, code "3". If the author refers to the theory
only  in  interpreting  the  results  obtained,  code  "4".  If  a  theory  is  not 
mentioned  and  you  must  infer  it,  code  "5". 

If the treatment involves mainstreaming only ("mainstreaming" defined as
a systematic, sustained effort to integrate disabled students in regular
classrooms for part or all of their instruction, as contrasted with bringing
disabled students into a regular classroom temporarily to provide contact as
part of a research project), it will be coded in Section C.10. In that case,
X-out the coding spaces for Sections C.3.-9. If, however, in preparation for
mainstreaming, there is a separate, codable research effort to change the
attitudes of, for example, students, teachers, administrators, or parents,
code the ESs for that part of the report in Sections C.2.-9., and X-out the
coding spaces for Sections C.2.-9. for the mainstreaming ESs.

C.3. Setting. The question here is, "Where, specifically, was the research
conducted?" If the research was carried out in an educational context
(numbers 1, 2, 3, and 4 in category B.4.) as part of regular instruction
(number 1, 2, or 4 in category G.2.), code "1", unless it is clear that the
students were not in a classroom for the treatment. Also, if the treatment
is a course or part of a course and the experimental Ss are compared against
control Ss in another course or class, code "1" for the control group, too.
If, however, the control Ss are simply tested and are not part of a
structured activity, such as a course, code "13". If the setting was a
gymnasium used for regular instruction, code "1". "10=recreation facility"
is to be coded for community recreation buildings. "4=institution" refers to
a building or set of buildings housing disabled persons on a permanent basis.
"5=group home" refers to a situation where a few disabled persons (usually 4
to 5) live in a home-like setting. A hospital, "6", is a facility where
medical treatment is provided. If a permanent living facility for mentally
retarded persons is referred to as a "hospital", code it as "4" nevertheless.
A laboratory ("11") is a special facility for conducting research, not
typically used for other purposes. "Individual/small group" ("12") refers to
situations in which there is interaction between a small number of people,
such as a counselor and parents, or a few parents in a discussion group. If
such interaction is the case, code "12" even though the discussion may take
place in, for example, a corner of a large classroom. "13=normal life" is
used to code studies in which some experience or intervention is provided as
the subjects go about their daily activities. For example, if students were
asked to simulate disabilities as they went about campus or shopped in
stores, or people were shown a film at a Lion's Club meeting, or as a short
feature at a movie theater, code "13".

C.4. Treatment/intervention technique(s). The major categories for coding
treatments are: "2", information, in which the technique is to provide
information about disabled persons as a means of modifying attitudes; "3",
direct contact, where subjects are put in personal contact with disabled
persons; "4", vicarious experience, in which the technique is to create
situations which will lead the subjects to put themselves in the place of
handicapped persons and feel what it is like to be handicapped; "5",
reinforcement, in which either classical or operant conditioning is used to
modify behavior assumed to reflect attitudes; and "6", persuasive message,
in which a message designed with an argument intended to convince people as
to what their attitudes toward disabled people should be is presented. If
the purpose is not to present an argument, but to test the efficacy of
presentation of information through a medium (e.g., a film), code "2". If
the study investigated the relative effectiveness of different messages or of
using different media to present a persuasive message, code "7=persuasive
messages, contrast". Some interventions may be a combination of these
techniques, and that should be coded. For combinations not encompassed in
"8" or "9", code "10" ("other") and specify the combination. For studies in
which the treatment is systematic desensitization (which might involve
exposure to disabled persons in imagination or through direct contact), code
"11". Typically in that case, "2" ("conditioning") will have been coded in
C.2.a. Attitude Change Theory. When "11" is coded, there may be no coding in
C.4.a-e. Code "a. Information" only if conveying information is intentional
and not incidental. Do code "b. Direct contact" if it is part of the
desensitization process. Code "d. Positive reinforcement" only if
reinforcement is used; i.e., do not code in that category if only extinction
is used.

Usually authors will indicate which technique they intended to use. In
some cases, you will need to make a best judgment based on the information
provided. The main point is to categorize studies by their intended
intervention techniques. For example, if the intended technique was direct
contact and the subjects were put in contact with disabled persons through a
panel of speakers or through tutoring disabled peers, even though it is
likely that they would have picked up information about disabled persons,
"2" should not be coded here (although it will be in 4.a. below). On the
other hand, if a research report states that the purpose was to convey
information about, as well as to put the subjects in contact with, disabled
persons, and a panel was used for that purpose, then "8", indicating a
combined information-direct contact technique, should be coded.

C.4.a. Information. (1) Type. If the intervention did not provide
information about disabled persons, then "0" should be coded. That will be
the case for control and placebo groups, and for many studies scored in any
treatment/intervention category other than "2" in C.4. However, if it is
evident that a treatment designed with a different intent does include
information about disabled persons, select the appropriate code.

"1=etiology" refers to information about the causes of disabilities.
"2=characteristics" refers to information about the characteristics,
including abilities as well as disabilities, aspirations and interests,
vocational and social capabilities, and the kinds of activities, such as
sports, in which disabled persons can engage. "3=problems" refers to
information about the difficulties encountered by disabled persons, including
learning problems encountered by learning disabled students. "4" refers to
programs in which the characteristics of disabled persons are presented, with
emphasis on the similarities with nondisabled persons. "7" refers to
information about the legal rights of disabled persons. "8"
("parenting/management") refers to general information about being a parent
or managing a classroom or business. "9" ("self") refers to providing Ss
with information about themselves, e.g., how their reactions compare to those
of other parents or teachers. "10" ("social relations") refers to general
information about how people relate to one another in group/social settings
and/or how one might relate to disabled persons in such settings. For a
regular course or program (see C.4.a. (2) convention) for which the content
is not specified, code "12" and write in "can't tell".

C.4.a.  (2)  Delivery  mode.  "O=none"  would  be  coded  for  control  groups.  "1", 
"lecture"  refers  to  lecturing  by  a  regular  instructor,  while  "7"  and  "8" 
refer  to  the  use  of  guest  lecturers  or  speakers.  "4",  "print"  refers  to  use 
of  textbooks,  research  reports,  other  expository  types  of  materials.  Case 
studies  presented  in  print  are  coded  "11".  If  a  film  or  other  material  with 
an  identifiable  title  is  used,  for  the  treatment  and/or  placebo  group,  write 
the  title  here  on  the  coding  sheet  and  on  a  COMMENTS  ON  STUDY  sheet  so  that 
the  information  can  be  retrieved  easily  later,  if  needed.  Simulations  may  be 
used  primarily  to  present  information  rather  than  to  evoke  vicarious 
experiences. If that intent is clear in the report, code "13". If the
effects  of  a  regular  course  (e.g.,  an  abnormal  psychology,  introductory 
special  education  course,  or  practicum  experience)  or  program  (e.g.,  a 
master's  program  for  rehabilitation  counselors  or  a  nurses'  training  program) 
with  no  special  treatment  (e.g.,  a  specially  selected  film,  special  speakers) 
is investigated, code "14" or "15", respectively. Indicate what is reported
about the mode or modes of delivery or write in "can't tell" if no specifics
are  given. 

C.4.b.  Direct  contact.  Each  of  the  direct  contact  categories  encompasses  a 
means  by  which  contact  with  disabled  persons  can  be  provided.  "1"  ("as 
companions")  refers  to  time  spent  with  a  disabled  person(s)  without  the 
structure  implied  by  other  categories,  such  as  peer  tutoring,  the  classroom, 
or  supervised  playground  activities.  "3"  ("cooperative  learning  groups") 
refers to a specific teaching approach in which students work together in
groups to attain common goals, minimizing competition among students for
grades and other teacher rewards. "5" ("as classmates, behavior modified")
refers to direct contact with disabled persons who have gone through a social
skills or academic skills training program of some sort in order to make
their behavior more acceptable to their nondisabled peers. That is true also
for "6", except here the concern was with making disabled students more
acceptable to their teachers. "0" is coded for control or placebo groups, or
for  studies  in  which  direct  contact  is  not  an  evident  part  of  the  treatment. 

C.4.c. Vicarious experience. Again, code "0" for control groups, placebo
groups, and studies in which vicarious experience is not an evident part of
the intervention. "1" refers to situations in which the subjects are asked
to role-play situations in which they have contact with disabled persons; "2"
refers to situations in which the persons actually role-play being disabled.
Role-playing involves, as the term "role" implies, interacting with other
persons in an acting-out situation. "3", simulation, involves the use of
devices to simulate a disability, such as using a wheelchair, having one's
hearing dampened, or wearing a blindfold. Projects may also attempt to evoke
vicarious experiences through having subjects observe others who are
role-playing or undergoing a simulation ("4"), through viewing video tapes or
films ("5"), through reading case studies ("6"), through looking at pictures
and/or photographs ("7"), through reading works of fiction or biography
("8"), or through playing with disabled dolls or watching plays put on with
disabled  puppets  ("9").  If  a  technique  to  provide  a  vicarious  experience 
cannot  be  reasonably  categorized  in  the  first  eight  categories,  categorize  it 
as  "other"  and  describe  it  briefly.  If  the  materials  used  to  provide 
vicarious  experiences  have  an  identifiable  title,  write  it  here  on  the  coding 
sheet  and  on  a  COMMENTS  ON  STUDY  sheet  for  possible  later  retrieval. 

C.4.d. Positive reinforcement. If the intent is to keep subjects unaware
that  they  are  being  reinforced,  then  "1"  should  be  scored.  If  there  is  no 
intent  to  conceal  reinforcement — for  example,  students  are  given  tokens  based 
on  the  number  of  times  or  length  of  time  they  play  with  disabled  students — 
then  score  "2".  Use  "0"  for  control  groups  and  for  studies  with  no  obvious 
reinforcement. 

C.4.e.  Persuasive  message.  If  use  of  a  persuasive  message  was  not  part  of 
the study, or if a control group is used in a persuasive message study, code
"0". "Expert" (M") refers to one who has credibility as knowledgeable about
disabilities  or  some  other  area  relevant  to  the  treatment  (e.g.,  attitude 
theories).  "6"  refers  to  the  persuasive  message  strategy  in  which  a  subject 
is asked to present a message in a way so as to convince somebody else, with
the intent that they will thereby persuade themselves. When the comparative
effects of two different messages are studied, code "7".

C.5. Treatment/intervention to change attitudes toward. Some reports will
identify specifically the disabled group of concern; that should be coded,
using  the  "other"  ("18")  category  only  if  necessary.  In  some  studies, 
attitudes  toward  disabled  persons  in  general,  rather  than  toward  a  specific 
disability,  will  be  the  target  and  "1"  should  be  coded.  Or,  attitudes  toward 
the  physically  disabled  in  general,  or  toward  those  who  are  "mentally 
retarded", will be the target; and "2" or "3" should be coded, respectively.
Educable mentally retarded persons are moderately retarded ("4") and
trainable mentally retarded persons are severely retarded ("5"). Code "6"
("mentally ill") or "7" ("emotionally disturbed") according to the terminology
used  in  the  report.  "Emotionally  disturbed"  will  usually  be  used  in 
reference  to  students  whose  behavior  creates  classroom  problems.  Include 
"behaviorally  disturbed"  students  in  "7". 

C.6.  Treatment/intervention  conducted  by.  Reference  here  is  to  the  person  or 
persons who carried out the actual treatment, including such activities as
introducing  the  treatment,  for  example,  a  film  or  speaker,  and  participating 
in  the  treatment  if  it  involves,  e.g.,  presenting  information  or  leading  a 
discussion,  as  contrasted  with  a  "nonperson"  treatment  such  as  a  film.  "1" 
("experimenter")  refers  to  a  person  who  is  a  project  director  and  responsible 
for  the  project.  "2"  refers  to  people — i.e.,  research  assistants — hired  to 
work  on  a  research  project.  "3"  refers  to  staff  who  are  regularly  employed 
in  the  setting  where  the  research  is  carried  out.  For  example,  if  regularly 
employed  teachers  carry  out  the  treatment/intervention,  then  "3"  should  be 
scored. If the experimenter has a role in orienting Ss and someone else
(e.g., a research assistant) conducts the treatment, code "4". In studies of
naturally occurring experiences, such as the effects of contact as camp
counselors, no treatment is actually "conducted", and "6", "not applicable",
should be coded. Code "not applicable" ("6") also for control groups (where
no treatment is applied).

C.7. Length of treatment/intervention. Following Category C.7.a., you are
asked to enter information in regard to days/week, minutes/day, number of
weeks, and total numbers of hours of treatment time. If information on
treatment  length  is  not  available,  enter  x's.  Length  of  treatment  is 
potentially  an  important  variable,  so  try  to  infer  the  length  if  it  is  not 
given  specifically.  For  example,  if  it  appears  regular  class  periods  were 
used,  assume  they  were  45  minutes  long.  Assume  the  same  for  a  treatment 
session, unless there is some counterindication. For regular programs (e.g.,
graduate  programs,  internships),  it  will  often  be  possible  to  determine  the 
number  of  weeks  (e.g.,  a  quarter  is  assumed  to  be  10  weeks;  a  semester,  15 
weeks)  but  not  the  number  of  hours  per  day  or  perhaps  even  days  per  week. 
Enter  the  data  on  number  of  weeks  and  enter  x's  in  the  spaces  for  which 
information  is  not  available. 

The  length  of  placebo  treatments  should  be  entered  if  available.  For 
control groups, "0's" will be entered, as they will receive no treatment.
When a reading assignment is part of a treatment, do not include that time in
the length of treatment unless the time Ss spent reading is clear from the
report. Note the reading on the COMMENTS ON STUDY sheet.

C.7.a. Information available. Some reports will not provide sufficient
information for you to determine the length of the treatment. In that case,
code "0" and enter x's in b. through e. in the experimental and control
spaces.

C.7.b. Number of days per week. Number of days per week does not refer to
full days but to the number of days during which there is exposure to the
treatment. Round to whole numbers. If, for example, students participate in
a program for four and a half days, enter "5". Or, if the treatment is a
film presented once (i.e., on one day), enter "1". Note that a control
group should be receiving no treatment, so 0's would be entered, unless those
spaces  are  being  used  for  a  Treatment  B  or  placebo  group. 

C.7.c. Minutes/day. Enter here, if available, the number of minutes per day
spent in the treatment. If treatment length is discussed in terms of class
periods or group presentations, assume a 45-minute period. For overnight
camps, use 12 hours as the estimate of contact time; and for day camps, use 6
hours. Estimate length, if possible. If length cannot be estimated, enter
x's. With a true control group, enter zeroes if information is available for
the experimental group. If information is not available for the experimental
group, also enter x's for the control group.

C.7.d.  Number  of  weeks.  Enter  only  whole  numbers.  If  information  is  given 
in months, assume 4.3 weeks per month. If the treatment is less than one
week, insert x's. Assume five days to the week and round as appropriate when
treatment  does  not  encompass  a  full  five  days  but  is  more  than  one  week. 
Follow  the  same  convention  as  in  7.c.  for  entering  information  for  the 
control  group. 

C.7.e.  Total  #  of  hours.  Multiply  the  #  of  actual  days  by  the  #  of  minutes 
per  day  and  divide  by  60  to  obtain  the  total  #  of  hours  of  treatment.  Round 
to  one  decimal  place.  If  there  are  more  than  999.9  hours  in  the  treatment, 
enter  999.9  and  note  the  actual  hours  on  the  COMMENTS  ON  STUDY  sheet. 
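
The arithmetic in C.7.e. can be sketched in Python as follows. This is an illustration only and not part of the coding sheet: the function name total_treatment_hours, the constant MAX_HOURS, and the choice of half-up rounding are assumptions introduced here, since the convention above says only to round to one decimal place.

```python
from decimal import Decimal, ROUND_HALF_UP

# Largest value that can be entered for total hours (C.7.e.).
MAX_HOURS = Decimal("999.9")


def total_treatment_hours(actual_days, minutes_per_day):
    """Return (hours_to_enter, note_needed) for C.7.e.

    hours_to_enter is days * minutes / 60, rounded to one decimal place.
    note_needed is True when the actual total exceeds 999.9 hours; in that
    case 999.9 is entered and the actual hours are noted on the
    COMMENTS ON STUDY sheet.
    """
    hours = Decimal(str(actual_days)) * Decimal(str(minutes_per_day)) / Decimal(60)
    if hours > MAX_HOURS:
        return float(MAX_HOURS), True
    # Half-up rounding is an assumption; the manual only says "round".
    rounded = hours.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
    return float(rounded), False
```

For instance, a treatment delivered in assumed 45-minute class periods on 10 days works out to 10 x 45 / 60 = 7.5 hours.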

C.8. Verification of treatment implementation. Here the question is whether
the researcher made an effort to verify whether the treatment was
implemented as intended, including whether the control group actually
received no relevant intervention during the treatment as intended. That is,
was evidence gathered as to whether the treatment was carried out with
fidelity? If no such verification is mentioned, "0" should be coded. In
rare instances, it may seem that no verification is necessary because of the
way the independent variable is defined, and "1" should be coded. Systematic
observation ("2") involves use of a category system to score behavior to
determine whether planned behaviors occurred. "3", nonstructured
observation, involves more casual observation in which, for example, the
researcher or an assistant might simply observe an intervention and make a
judgment as to whether it was implemented properly. Subjects might be
interviewed ("4") or be given a questionnaire ("5") in order to determine
whether they perceived intended variations in treatments as one check on
whether those variations occurred. Also, the researcher might debrief the
persons who carried out the intervention to determine whether they thought
that they had implemented it successfully ("6=intervener follow-up"). If a
combination of verification techniques is used, score "7" and enter the
numbers for the separate techniques.

C.8.a. Reporting. If no verification was attempted, code "0". "1" should be
scored if data are presented and simply referred to with no analysis. "2"
should be scored if data are presented and some sort of analysis (e.g., the
statistical significance of differences between proportions of behaviors for
treatments) is reported. If the author or authors simply assert that they
gathered data that indicated verification, or that their unstructured
observations indicated implementation, then score "3".

C.8.b.  Degree  of  implementation  claimed.  If  "0"  or  "1"  was  coded  in  C.8., 
code "0" here. Probably no author would indicate that he or she failed to
implement the treatment; or, at least, no such article would get published.
"1=none" is to be scored if, despite an apparent effort at verification of
the treatment, no claim is made in regard to implementation. It is also not
likely that "2=some" will be coded frequently.

C.8.c. Basis for author's conclusion re implementation. If "0" or "1" was
coded in C.8.b., code "0" here. Sometimes authors will claim that the
treatment was carried out, but give no basis for that conclusion; in that
case, code "1". If the author simply refers to a judgment made, for example,
based on observations, code "2". If statistically significant differences
between treatments (for example, in terms of intervener behavior) are
reported, code "3". If the author simply reports data and claims that
inspection of the data confirms verification, code "4".
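
The skip rules that link C.8., C.8.a., C.8.b., and C.8.c. lend themselves to a simple consistency check on a completed coding sheet. The following Python sketch is one way such a check might look; the function name check_c8_consistency and the use of plain integers for the codes are assumptions made for illustration, and only the three rules stated in the text are checked.

```python
def check_c8_consistency(c8, c8a, c8b, c8c):
    """Return a list of messages for violated C.8. skip rules.

    An empty list means the four codes are mutually consistent under the
    rules stated in the manual. Arguments are the integer codes entered
    for C.8., C.8.a., C.8.b., and C.8.c. (hypothetical representation).
    """
    problems = []
    # C.8.a.: if no verification was attempted (C.8. = 0), code "0".
    if c8 == 0 and c8a != 0:
        problems.append('C.8.a. should be "0" when C.8. is "0"')
    # C.8.b.: if "0" or "1" was coded in C.8., code "0".
    if c8 in (0, 1) and c8b != 0:
        problems.append('C.8.b. should be "0" when C.8. is "0" or "1"')
    # C.8.c.: if "0" or "1" was coded in C.8.b., code "0".
    if c8b in (0, 1) and c8c != 0:
        problems.append('C.8.c. should be "0" when C.8.b. is "0" or "1"')
    return problems
```

A check of this kind only flags contradictions between entries; it does not replace the coder's judgment about which code is correct.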

C.8.d. Actual implementation. Here you must make a judgment, based on your
best reading of the report, as to how well the treatment was implemented.

C.8.e. Description of treatment adequate for replication. The point of
concern here is not whether the treatment could possibly be carried out by
someone else interested in changing attitudes toward disabled persons, but
whether another researcher interested in repeating the study would have
enough information about the treatment to repeat it with fidelity. The
question to ask is, if I were to attempt to repeat this study, how confident
would I be that I could carry out the treatment as it was conducted the first
time?


C.9. Treatment validity. Even if a treatment is implemented with fidelity,
the subjects may not experience the treatment, intervention, or conditions
as intended by the researcher. The subcategories here get at various
dimensions of treatment validity, of which fidelity of implementation is one.
The question is, if the treatment had an effect, or failed to have an effect,
was the outcome likely due to the treatment as intended or due to associated,
but unintended, aspects of the treatment?

If  secondary  ESs  involve  within-gender  treatment  comparisons,  use  the 
coding  for  the  treatment  main  effect  ESs  unless  there  is  reason  to  think 
validity  differed  by  gender.  For  interaction  secondary  ESs,  make  an  overall 
judgment  of  validity  for  each  category  in  C.9. 

Many  times  authors  of  research  reports  will  not  provide  the  information 
necessary  to  determine  whether  treatment  validity  is  threatened.  For 
example, often no evidence will be reported to verify that the treatment was
presented  as  intended.  Or,  the  author  will  not  address  the  question  whether 
the  Hawthorne  effect  or  some  other  effect  might  have  been  an  unintended  part 
of  the  treatment  or  intervention.  If  there  is  a  lack  of  information  as  to 
whether a threat was present, code "0=can't tell". At the same time, you
should use your common sense and knowledge of intervention settings to decide
whether a threat is plausible. Cases may arise where a category of treatment
validity does not apply. For example, in coding a pre-post, single-group
study, "c. John Henry" would not be relevant. In such cases, code "5".

C.9.a. Implementation. Enter here the number which you coded in C.8.d.,
Actual Implementation, above.

C.9.b. Hawthorne. The Hawthorne effect refers to influences on the behavior
of subjects simply because they are aware that they are participating in an
experiment. Such an effect might occur, for example, when students in a
psychology class must volunteer to participate in an experiment for class
credit, when it is announced to subjects that they are part of a research
project, or when Ss are brought into a laboratory for special treatment. On
the other hand, if there is no reason to think that students would be aware
that they are part of an experiment (for example, a film is presented by
their teacher as a regular part of course instruction), then code "1",
indicating that there was no plausible threat.

C.9.c. John Henry. The John Henry effect occurs when members of the control
group recognize that a treatment group is getting special treatment, and
"work harder" in order to show that they can perform as well without the
special treatment. It is probably not a common factor in attitude research,
as contrasted with research where some sort of achievement is assessed as an
outcome. For pre-post, single-group designs, enter a "5", as "can't tell"
is not pertinent and "1" would be a positive design indicator.

C.9.d.  Treatment  diffusion.  Sometimes  in  educational  and  psychological 
research  the  subjects  in  different  treatment  groups  will  communicate  with 
each other, so that a control or placebo group becomes aware of and/or
knowledgeable about the treatment, thereby "washing out" the planned
differences between the groups. Code "5" for a pre-post, one-group design.

C.9.e.  Dissatisfaction/resentment.  Groups  of  subjects  who  perceive  that  they 
are receiving a less desirable treatment, a more demanding treatment, or an
overly demanding treatment may become dissatisfied and/or resentful, adding a
characteristic to the independent variable not planned by the researcher.
Disruption by the treatment that causes resentment would be scored here;
disruption  that  makes  the  treatment  difficult  for  the  intervener  to  handle  is 
coded  in  C.9.f. 

C.9.f. Novelty/disruption. If an intervention is new and exciting to the
students, or if it disrupts daily routines, that may be an unplanned
treatment characteristic. However, if novelty is a planned aspect of a
short-term intervention (for example, that it is unusual to be around
disabled persons), then it is not a threat to the treatment validity.
Particularly if the treatment is a one-shot affair, novelty may be an
important planned characteristic. However, if the treatment is intended to
be used over a long period of time (e.g., as a regular part of a class), then
novelty/disruption becomes a concern in answering the basic question for this
category: Would the effects hold up if the subjects were exposed to the
treatment for a longer period of time than reported in the study?

C.9.g. Experimenter effect/expectations. Here the question is whether, as a
part of presentation of the intervention, experimental expectations were
conveyed in such a way that those expectations become part of the treatment,
affecting the results. Experimenter effects will usually be a concern when
the researcher conducts the treatment. The concern will be especially great
when there are no checks on the way in which the treatment is presented, on
how the experimenter is perceived, or on whether expectations are projected
to the subjects (such as when the introduction of stimuli, such as films,
might be affected by the researcher's knowledge that he or she is presenting
to a treatment or placebo group).

C.9.h. Treatment/experimenter confounded. When an intervention is carried
out by only one person or very few persons, the treatment is confounded with
the personal characteristics of the intervener (for example, the enthusiasm
or lack of enthusiasm of the intervener). This category is related to
experimenter effects. However, that category, as used here, refers to
conveying expectations based on knowledge of or desires for the research.
The concern here is that any intervener, experimenter or not, brings personal
attributes to the intervention situation, regardless of their knowledge.
Unless these specific characteristics are "averaged out" over a number of
interveners, the treatment effects cannot be separated from the personal
characteristics of the intervener.

C.9.i. Test by treatment interaction. Part of the Ss' experience with the
treatment may be due to the fact that they were sensitized by the content or
methodology of a pretest or a posttest. In that case, the testing experience
becomes a part of a treatment not anticipated by the researcher, and a threat
to treatment validity. This category is related to the later one, D.10.d.,
Reactivity of measure, and should be scored in that context. That is, the
more  reactive  the  test,  the  more  likely  it  is  that  there  was  test-by- 
treatment  interaction. 

C.9.j.  Multiple  treatment  interference.  There  are  two  types  of  multiple 
treatment interference. One occurs when, within a research project, the
subjects are exposed to more than one treatment; the other occurs when
subjects have participated in one or more prior research projects which
affect their reactions to the treatment in the reported project. In both
cases, then, the treatment experienced by the Ss is not what the researcher
intended.

C.9.k. General Treatment Validity. Given the conception of a treatment (the
treatment construct) which the researcher had in mind, what was the general
validity of the treatment as carried out? Make your judgment based on your
coding of categories C.9.a. through C.9.j. above. Although it is difficult to
specify a particular numerical formula for combining those ratings to arrive
at a General Treatment Validity score, an excellent rating ("1") would mean
that most of the sub-categories received "1's" (not a plausible threat) with
not more than one or two "can't tell" ("0") ratings; if any minor problems
("2") or substantial problems ("3") are indicated, the construct validity
should be rated "fair" ("2"); if there are more than two or three
substantial problems coded and/or one or more major problems ("4"), then
"poor" ("3") should be coded.

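The guidance in C.9.k. is deliberately a matter of judgment rather than a formula, but its general shape can be sketched in Python. The sketch below is an assumption-laden illustration, not the project's scoring rule: the function name general_treatment_validity, the cut-offs max_cant_tell=2 and max_substantial=2, and the decision to return "fair" when there are no coded problems but too many "can't tell" ratings are all choices made here for the example.

```python
from collections import Counter


def general_treatment_validity(codes, max_cant_tell=2, max_substantial=2):
    """Suggest a C.9.k. rating (1=excellent, 2=fair, 3=poor).

    codes holds the C.9.a. through C.9.j. ratings, where
    0 = can't tell, 1 = not a plausible threat, 2 = minor problem,
    3 = substantial problem, 4 = major problem, 5 = not applicable.
    The two thresholds are hypothetical readings of "one or two" and
    "two or three" in the manual.
    """
    # Categories coded "5" (not applicable) are left out of the tally.
    counts = Counter(code for code in codes if code != 5)
    if counts[4] >= 1 or counts[3] > max_substantial:
        return 3  # poor
    if counts[2] > 0 or counts[3] > 0:
        return 2  # fair
    if counts[0] <= max_cant_tell:
        return 1  # excellent
    # Assumption: many "can't tell" ratings alone yield "fair".
    return 2
```

A coder would still review the suggested rating against the report, since the manual asks for an overall judgment rather than a mechanical count.
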
C.10. MAINSTREAMING. As indicated in C.1., mainstreaming as a treatment/
intervention technique, in which disabled students are integrated into
regular classrooms for one or more periods a day (see also C.2.b., p. 18), is
coded separately in this section. If research on the use of other
techniques to modify attitudes toward disabled persons prior to mainstreaming
is reported, those treatments should be coded in the prior sections, and the
ES columns X'd-out in C.10.

C.10.a. Presence. The purpose of this category is to be able to sort out
easily all mainstreaming studies, whether scored in the prior sections or
not. "1=mainstreaming only" refers to studies not coded in the prior
sections. If an investigation of changing attitudes prior to mainstreaming
was coded, code "2" here and X-out the coding categories in Section C.10 for
these ESs. If research into the effects of mainstreaming was not part of the
study, code "0" and X-out Sections C.10.b.-C.10.o.

C.10.b. Type of study. Sometimes when mainstreaming is implemented in a
school or school district, a research study will be planned as part of the
implementation. In that case, code "1=planned". In some cases, once the
mainstreaming is in place, someone decides that it would be interesting to
gather information on the attitudes of mainstreamed vs. non-mainstreamed
children, even though it had not been planned prior to the mainstreaming. In
that case, code "2=post hoc". In terms of design, post hoc studies will
typically be scored as static group designs in the Internal Validity section
(E.1., p. 13). Remember, if a post hoc mainstreaming study is
correlational — that  is,  it  does  not  involve  comparisons  between  groups  of 
students  categorized  as  having  been  involved  in  mainstreamed  vs. 
nonmainstreamed  programs,  but  instead  attempts  to  obtain  indicators  of  amount 
of  contact  and  correlates  amount  of  contact  with  attitudes — then  it  does  not 
fall  within  the  defined  population  of  studies  for  this  meta-analysis.  If 
you  cannot  tell  from  the  report  whether  the  study  was  pre-planned  or  post 
hoc,  code  "0=can't  tell". 

C.10.c. Instruction in mainstream classes. If you cannot tell what type of
classroom  instruction  was  used,  code  "0".  (It  is  assumed  that  there  was 
instruction  in  an  integrated  classroom  or  the  study  would  not  be  coded  under 
"mainstreaming".)  Sometimes  mainstreaming  will  occur  with  no  particular 
adaptations  in  the  typical  group-class  instruction.  If  so,  code  "1".  If 
there  is  no  indication  of  any  special  modification  in  classroom  instruction, 
"1"  should  be  coded.  If  classroom  instruction  is  conducted  using  what  is 
called  "cooperative  learning",  in  which  students  work  in  groups  and  do  not 
compete  with  one  another  for  grades  and  other  teacher  rewards,  code  "2".  If 
individualized instruction or peer tutoring is used, code "3" or "4",
respectively. "5" should be coded for any combinations of the above
instructional approaches. With a "5", be sure to specify what the
combination  is,  writing  in  the  numbers,  including  "6"  if  an  "other" 
instructional  approach  is  used. 

C.10.d. Special personnel support. Often, as part of mainstreaming, teachers
will  be  given  special  support:  Some  sort  of  preparatory  workshop  is  provided 
to  acquaint  them  with  the  characteristics  of  disabled  students  and  how  best 
to  teach  them  (code,  "2");  or,  consultants  are  provided,  such  as  special 
education  specialists,  to  help  teachers  cope  with  any  problems  which  might 
arise  (code,  "3").  Usually,  any  special  support  will  be  mentioned.  If  it  is 
not, code "1=none", unless there is some ambiguity in the report, making
"0=can't  tell"  applicable.  Remember  that  this  category  refers  to  special 
support.  If,  for  example,  disabled  students  continue  to  be  in  a  special 
education  classroom  or  resource  room  during  the  day  as  they  would  have  been 
without  mainstreaming,  that  is  an  on-going  program  component,  not  special 
support. 

C.10.e. Special skills training for disabled students. In order to
facilitate  interactions  between  disabled  students  and  nondisabled  students  in 
a  mainstreaming  program  to,  hopefully,  thereby  improve  the  attitudes  of 
others  toward  the  disabled  students,  special  social  or  academic  skills 
training  may  be  provided  to  the  disabled  students  as  part  of  the 
mainstreaming.  If  one  or  the  other  or  both  types  of  skill  training  are 
reported,  code  "2",  "3",  or  "4",  as  appropriate.  Academic  skills  training 
will  typically  be  a  regular  part  of  special  education;  code  it  here  only  if 
something special has been added to enhance attitude change on the part of
nondisabled students in the mainstreamed setting. If such training is not
mentioned, score "1=none". If there is some indication that such training
might have occurred, but you cannot tell for sure, code "0=can't tell".

C.10.f. Type of special skills training. If a study was coded in C.10.e. as
"1", not providing training in social, academic, or both types of skills, or
as "0", then code "0" here. Otherwise, code the type of skills training that
was provided. Coaching ("1") involves providing instructions to disabled
students on how to behave while they are interacting or following interaction
with other persons in a real or a role-playing situation. Modeling ("2")
involves having the disabled student watch other persons, either in person or
via a medium such as a film, perform appropriate behaviors. Counseling
("3") involves one-to-one or small group sessions in which feelings and
appropriate reactions are discussed. The use of reinforcement to make
behavior more appropriate might be applied either to the individual disabled
student ("4") or through the manipulation of group contingencies ("5") in
which, for example, the whole group would be rewarded only if all individual
members behaved appropriately. Diagnostic/prescriptive training ("6") refers
to a process which is particularly likely to be applied to academic skills.
It involves careful diagnosis of the disabled student's learning
difficulties, with a prescription for specific remediation or action.
Because diagnostic/prescriptive teaching is often a regular part of special
education that disabled students are likely to be receiving anyhow if they
spend part of their day in a special classroom or resource room, be careful
to code "6" only if it has been added as special training for the purpose of
enhancing mainstreaming. Cognitive control ("7") refers to a technique to
help students gain control of their own behavior, for example, by repeating
to themselves the steps for carrying out a procedure prior to trying it.
Sometimes self-control of reinforcement is included.

C.10.g. Special instruction for nondisabled peers. As part of mainstreaming
programs, special interventions may be provided for nondisabled peers. That
should be scored here. Note that the categories are identical to categories
2 through 6 in the Treatment/Intervention Techniques section (p. 5). If the
special instruction for or intervention with nondisabled peers is provided
prior to, and in that sense is separate from, the mainstreaming, it should
be scored as a separate treatment in the prior sections. If no special
training is mentioned, code "1". Code "0" if there is some indication that
special training may have occurred but you cannot tell for sure.

C.10.h. Parent education. Sometimes as part of mainstreaming programs,
interventions will be carried out with parents to enhance their understanding
of mainstreaming and/or disabled children. If only the parents of disabled
children are involved in such a program, code "2". If only the parents of
nondisabled children are involved, code "3". If both types of parents are
involved, code "4". Unless parent education is mentioned explicitly, code
"1"; code "0" if there is some indication of training, but it is not certain.

C.10.i. Type of parent education. If "0" is coded in C.10.h., code "1=not
applicable". Otherwise, code using the prior definitions for treatment
categories.

C.10.j. Handicapped children in mainstreaming classes. Here code the type or
types of disabilities of the mainstreamed disabled children.

C.10.k. Number of minutes handicapped children spend daily in mainstreamed
class. If no information is available about the amount of time spent in
mainstreaming classes each day, enter x's. If the amount of time is reported
in terms of class periods, use an estimated 45 minutes per period to compute
the number of minutes per day.

C.10.l. Number of days per week in mainstreaming class periods. Enter days,
rounding to a whole number as necessary. Unless there is some indication to
the contrary, assume 5 days per week. Enter x's if no information is
available.

C.10.m. Total minutes per week in mainstreaming class. Multiply the number
of minutes in C.10.k. times the number of days in C.10.l. to obtain the figure
for this item, unless the number of minutes per week is stated specifically
in the report. Enter x's if the information is not available.
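
The arithmetic in C.10.k. through C.10.m. can be sketched in a few lines of
Python. This is only an illustration of the rules as stated above, not part
of the coding instrument; the function names and the "x" placeholder used
for missing entries are assumptions made for the example.

```python
from typing import Optional, Union

MINUTES_PER_PERIOD = 45      # the manual's estimate for one class period
DEFAULT_DAYS_PER_WEEK = 5    # the manual's default absent contrary information
MISSING = "x"                # hypothetical stand-in for "enter x's"

Entry = Union[int, str]


def daily_minutes(minutes: Optional[float] = None,
                  periods: Optional[float] = None) -> Entry:
    """C.10.k: minutes per day, converting class periods when necessary."""
    if minutes is not None:
        return round(minutes)
    if periods is not None:
        return round(periods * MINUTES_PER_PERIOD)
    return MISSING  # no information about daily time


def weekly_minutes(per_day: Entry,
                   days_per_week: float = DEFAULT_DAYS_PER_WEEK,
                   stated_weekly: Optional[float] = None) -> Entry:
    """C.10.m: a weekly total stated in the report takes precedence."""
    if stated_weekly is not None:
        return round(stated_weekly)
    if per_day == MISSING:
        return MISSING
    # C.10.l: days are rounded to a whole number before multiplying.
    return per_day * round(days_per_week)
```

For a report describing two class periods a day with no indication of the
number of days, `weekly_minutes(daily_minutes(periods=2))` applies the
45-minute estimate and the 5-day default in turn.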

C.10.n. Months in mainstreamed program when outcomes assessed. Here the
question is, how long had the students been involved in the mainstreaming
program when attitudes toward disabled persons were assessed? The exposure
during the school year is of interest. Assume a 9-month school year. So,
for example, if the report indicates that mainstreaming had been in effect
for three years, multiply 3 x 9 = 27 to get the total number of months,
unless the report includes a more precise indicator of the months. If the
number of weeks is reported, assume 4.3 weeks per month.
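
The conversion in C.10.n. can be sketched as follows. The 9-month school
year and the 4.3 weeks per month come from the item above; the function
name, the keyword arguments, and the rounding to whole months are
assumptions made for this example, since the manual does not state how
fractional months are to be entered.

```python
MONTHS_PER_SCHOOL_YEAR = 9   # from C.10.n.
WEEKS_PER_MONTH = 4.3        # from C.10.n.


def months_in_program(years: float = 0, months: float = 0,
                      weeks: float = 0) -> int:
    """Convert reported exposure to months of mainstreaming.

    Rounding to a whole month is an assumption of this sketch.
    """
    total = (years * MONTHS_PER_SCHOOL_YEAR
             + months
             + weeks / WEEKS_PER_MONTH)
    return round(total)
```

The manual's own example, three years of mainstreaming, corresponds to
`months_in_program(years=3)`.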

C.10.o. Outcome measured for. Indicate here the Ss for whom attitudes toward
disabled persons were assessed.

D. DEPENDENT MEASURES

Recall (statement of General Purpose and Populations) that to be
included in this meta-analysis, a study must include the assessment of
attitudes toward disabled persons as an outcome measure. The assessment may
be direct or implicit, with attitudes indicated, for example, by choice of
playmates, willingness to associate with disabled persons, or the nature of
interactions with disabled persons.

If a measure assesses only attitudes toward children or other persons
generally, rather than toward disabled persons specifically, the study does
not qualify. (A general scale of attitudes toward children would be
acceptable if used to assess attitudes specifically toward disabled children,
as indicated, e.g., by instructions to Ss, even if disabilities are not
mentioned in the items.) By the same token, measures which assess attitudes
toward mainstreaming or some other such intervention strategy do not qualify,
because scores on such assessments are not clear-cut indicators of
attitudes towards disabled persons. That is, a person may feel positively
toward disabled persons and yet believe that mainstreaming, especially under
certain circumstances, is not appropriate.

Also, behavioral measures must be clearly related to attitudes toward
the disabled. A measure does not qualify if, for example, interactions
between disabled and nondisabled Ss are scored or reported so that only the
quantity of general interaction can be ascertained, with no indication of
whether the interactions were positive or negative; or if the behaviors of
disabled and nondisabled students are not analyzed separately; or if
interactions of nondisabled students with disabled and other nondisabled
students are not separated for analysis.

D.1. "Attitude" defined. If the report does not contain an explicit
definition of "attitude" in its introductory sections or in a Data or
Instruments type of section, code "0". Attitudes may be defined affectively
("1"), that is, in terms of feelings toward a referent; cognitively ("2"),
that is, in terms of beliefs about a referent, although rarely will this be
the only component of a definition; or in behavioral terms ("3"), that is,
as a manifestation of affect and cognition in approaching or avoiding a
referent. Most commonly, a combination of affective, cognitive, and behavioral
elements will be used, if any definition is given at all.

D.2. Number of dependent measures. Enter (separately for regular and ES
Information Missing CODING INSTRUMENTS) the number of dependent measures that
are appropriate for the meta-analysis (which may be different from the
total number of dependent measures for which findings are reported). The
interest is in the number of dependent measures for which ESs can be
computed, not the number of types of measures. For example, if a semantic
differential is used with two concepts (e.g., Mental Retardation and
Physically Disabled), there are two (2) dependent measures. If the specific
names of instruments are given, list the names. If the form of an instrument
is reported, list that, too. When you list the names of specific
instruments, capitalize and underline, to indicate titles. Also, list names
and forms, if available, on the COMMENTS ON STUDY sheet. If names are not
available, list the general category of the dependent measure, such as
systematic observation of behavior or teacher-made attitude scale.

D.3. Use of common instrument. Listed are acronyms for certain instruments
which are likely to be used frequently in the research being reviewed for
this meta-analysis. If none was used in the study, code "0". The ATDP ("1")
is the Attitudes Towards Disabled Persons Scale. The OMI ("2") is the
Opinions about Mental Illness Scale. The ATHI ("4") is the Attitudes Towards
Handicapped Individuals Scale. The ATBS ("5") is the Attitude to Blindness
Scale. The MRAI ("6") is the Mental Retardation Attitude Inventory. The
MTAI, revised ("7"), is the Minnesota Teacher Attitude Inventory. The MTAI
will be appropriate for this meta-analysis only if it has been revised so
that it assesses attitudes toward disabled persons in particular, rather than
toward students in general.

A measure commonly used in attitude studies is "3", the Rucker-Gable
Educational Programming Scale (RGEPS). Respondents are presented with brief
descriptions of children who exhibit behaviors one would expect of mentally
retarded, emotionally disturbed, or learning disabled children and asked to
select the most appropriate educational placement for the child, ranging from
regular classroom placement to full-time special class placement. The
placement selection is considered to be a measure of the degree of social
distance the teacher prefers to maintain between himself or herself and such
students (Horne, 1985, pp. 53-54). As indicated in the introductory paragraph
to this section, such placement decisions may be affected by other factors
than attitudes toward the disabled. Nevertheless, we will include the
Rucker-Gable Scale as a dependent measure for this review when researchers
say they are using it to assess attitudes toward disabled persons. It should
be coded in D.10.e. as having low validity. However, if a report indicates
that the RGEPS was used to assess attitudes toward mainstreaming, or is
unclear as to its use, it does not qualify as a dependent measure for this
review.

The CMI (Custodial Mental Illness Inventory), which assesses attitudes
toward custodial care of the mentally ill, with some items on mental
patients, does not qualify as a dependent measure for this review. The AAQ
and the CAQ (Client Attitude Questionnaire) are also measures of attitudes
toward working with the mentally ill (and, for the CAQ, toward etiology), not
of attitudes toward the mentally ill.

D.4. Common instrument modified. On occasion, researchers will modify an
instrument listed in D.3. for use in a particular study, either to be more
appropriate for a particular population of subjects ("2") or to assess
attitudes toward different disabilities than those mentioned in the original
instrument ("3"). If some other modification is made, score "4" and
indicate the type of change. If modifications are made, they will typically
be mentioned; so if the report makes no mention of modifications, code "1".
If "0" was coded in D.3., code "0".

D.5. Source of data. For this item code the source of the data for the
particular instrument used to obtain the specific effect size. If the data
were obtained from the individuals themselves as, for example, with an
attitude survey or a sentence completion test, score "1". In some studies,
attitude change is gauged by asking people to assess the attitudes or
attitude change of other individuals. For example, a teacher may be asked to
assess the attitudes or attitude change of students in his or her classroom.
Categories "2", "3", and "4" are for coding that source of data. Observation
("5") includes not only scoring behavior as it occurs, but the scoring of
transcripts of discussions or the scoring of written material, such as
essays, with a coding system. A "non-project request" ("6") involves
obtaining responses or behaviors in a situation removed from the treatment,
to which it is hoped that the Ss will see no connection. A classic example
is Rokeach's use of responses on his Value Survey to create value dissonance
in regard to racial discrimination on the part of university students, then
later having mailed to them a solicitation to join NAACP, with joining or not
as the dependent measure.

D.6. Type of assessment. A number of types of instrumentation are listed,
most of which are fairly common. Peer assessment, "5", refers to having
individuals provide some sort of an assessment or evaluation of other persons
as an indicator of attitudes towards those assessed. For example,
nondisabled Ss might be asked to list the positive or negative attributes of
students, with the number listed the dependent measure. On the other hand,
an adjective check list, "15", provides students with a list of adjectives
and they are asked to check those which apply to certain labels, such as
"mentally retarded". "8=systematic observation" includes the use of a set of
categories to score transcripts of discussions or other interactions, as well
as live behavior, and the coding of writing, such as essays, produced by the
Ss. For instruments that assess behavioral intentions (e.g., intent or
willingness to invite a disabled person home, to volunteer to work with
disabled persons), code "17" and write in "behavioral intention to . . .",
finishing with the intended behavior.

D.7. Source of instrument. Of interest in coding this item is where the
researcher obtained the instrument. In some cases, no "instrument" is
involved in assessment, as when, following a film or some other treatment,
students are asked to volunteer to assist disabled persons in some way and
the percentage of volunteering is recorded. In that case "0=not applicable"
should be coded. In some cases, no mention of a source is made, and "1"
should be coded. Instruments developed by teachers, individually or with
little or no specialized assistance, are to be coded "2", while an instrument
developed as part of a particular research project would be coded "3". If
one of the "common instruments" (D.3.) was used, or if the instrument was
developed or used in prior research, code "4". When a semantic differential
is used, with reference to Osgood, Suci, and Tannenbaum or a similar source,
using Osgood et al.'s adjective pairs with terms (such as "mental
retardation" or "mental patient") selected for the project, code "4". Only
code "3" if the researcher developed his/her own set of adjective pairs. If
an instrument was modified for use in the study being coded, code "5" and
specify "instrument modified".

D.8.  Development  by  project.  If  "2",  "3",  or  "5"  ("instrument  modified")  is 
coded  in  D.7.,  then  the  adequacy  of  description  of  the  development  process 
should  be  indicated  by  scoring  "1",  "2"  or  "3".  "Adequate  description"  is 
defined  as  sufficient  information  so  that  you  could  replicate  the  development 
process as a researcher.

D.9. Reliability of scores. a. Mentioned? If there is no mention in the
research report of the reliability of the scores for the dependent measure in
an effect size, code "0". (Although "test reliability" is technically
incorrect, code reference to it as score reliability.) If reliability is
mentioned, but no coefficient is reported (e.g., a statement is made that
"Reliability was found to be adequate" or "Researchers have generally found
the instrument to possess adequate reliability"), code "2". Code "1" only if
one or more coefficients are reported for the specific dependent measure.

D.9.b. Source. If "0" or "2" was coded in D.9.a., code "0" here. If a
coefficient or coefficients are reported, indicate whether they were computed
using data from the sample for the particular research study ("1") or
reported from other research ("2"), or a combination of the two.

D.9.c. Method. Of concern here is the method used to estimate the
reliability coefficient reported in D.9.d. If no reliability coefficient was
reported, code "0". If a reliability coefficient is reported, but you cannot
tell what type it is, score "1". Corrected split-half, Kuder-Richardson, and
alpha coefficients are all internal consistency methods ("3"). Note that
both inter-observer agreement ("5" and "6"), i.e., between observers, and
intra-observer agreement ("7" and "8"), i.e., scores obtained by the same
observer, are included. "Categorization reliability" ("9") should be coded
when the reliability of observational scores is reported, rather than simply
reporting inter-observer or intra-observer agreement.

D.9.d. Coefficient. If a reliability coefficient or percentage of agreement
is reported for a dependent measure, insert it here without the decimal
point. A reliability coefficient of .80 would be entered as 80. If
coefficients obtained with more than one method are reported, choose one to
enter using the following order of preference: (1) test-retest, (2) internal
consistency, (3) alternate forms. "4=inter-observer - %" and
"6=intra-observer - %" refer to percentage of agreement in categorizations.
"5=inter-observer - r" and "7=intra-observer - r" refer to coefficients for
the correlation between categorizations. If both percentage of agreement and
a correlation coefficient are available for inter-observer or intra-observer
agreement, enter the correlation coefficient. If both inter- and
intra-observer agreement are available, enter the inter-observer agreement
information. If inter- and/or intra-observer agreement are reported along
with categorization reliability, record the categorization reliability
figure.

If no coefficient is reported, enter x's. If more than one coefficient
is reported for the preferred method or in a general reference where method
is not indicated (e.g., "Researchers have reported reliability coefficients
ranging from .60 to .80"), enter the median of the coefficients reported.
That is, in the above example, 70 would be recorded. If coefficients are
reported for two forms of a test, one of which is used for a pretest, or for
pretest and posttest data for the same test, record the posttest coefficient.
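
The selection rules for test-type coefficients in D.9.d. (order of
preference, median of multiple coefficients, entry without the decimal
point) can be sketched as below. This is a simplified illustration only: it
does not cover the observer-agreement or pretest/posttest rules, and the
dictionary layout, the "unspecified" key, and the two-character "xx"
placeholder are assumptions made for the example.

```python
from statistics import median
from typing import Dict, List

# Order of preference stated in D.9.d.
PREFERENCE = ["test-retest", "internal consistency", "alternate forms"]
MISSING = "xx"  # hypothetical two-column stand-in for "enter x's"


def coefficient_entry(reported: Dict[str, List[float]]) -> str:
    """Return the entry for D.9.d. from coefficients grouped by method.

    Keys are method names; "unspecified" (an assumed label) holds
    coefficients from a general reference that names no method.
    """
    chosen: List[float] = []
    for method in PREFERENCE + ["unspecified"]:
        if reported.get(method):
            chosen = reported[method]
            break
    if not chosen:
        return MISSING
    # Median of the coefficients for the chosen method, entered
    # without the decimal point (.80 -> "80").
    return f"{round(median(chosen) * 100):02d}"
```

With the manual's example of coefficients "ranging from .60 to .80", the call
`coefficient_entry({"unspecified": [0.60, 0.80]})` takes the median of the
two endpoints.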

D.9.e. Magnitude. If no coefficient was reported, code "0". Otherwise,
enter the number which indicates the range within which the reliability
coefficient for the particular ES fell.

D.10. Validity of scores. a. Discussed. If there is no discussion of score
validity (or of "test validity") in the report, code "0"; but if validity is
mentioned and discussed, even if somewhat superficially, score "1". If a
comprehensive discussion of validity is presented (usually encompassing
mention and perhaps justification of the type of validity, and presentation
of validity evidence from prior studies and/or the current study), code
"2=comprehensively". Mentioning a factor analysis is not sufficient alone;
the author(s) must indicate specifically that they see the relevance to
validity.

D.10.b. Type. Code "0" if "0" was coded in 10.a. Code "1", "general", when
validity is discussed but with no indication of what type of validity the
author or authors of the report believe is involved. Face validity, "2",
refers to a claim in a report that validity is obvious due to the nature of
the items or the assessment. If construct validity is alluded to in terms of
either discrimination or correlation (i.e., that the means of distinct groups
are different as predicted, or that the scores correlate with other
non-attitude scores as predicted or yield factors as expected), or if
reference is made only to "construct validity", code "3". If the construct
validity is estimated by having experts judge the items and/or the total
test, code "4". If the validity of the scores is estimated through their
correlation with attitude scores obtained on another attitude instrument or
assessment procedure, code "5=concurrent". If a combination of types of
validity is discussed, code "6", and enter the numbers for the types.

D.10.c. Source. If a "0" has been coded for 10.a. and 10.b., code "0" here.
If, however, validity is discussed, but the source of the validity evidence
or judgment is not mentioned, code "1". If the discussion of validity is
based on general references to the literature without citations of one or
more specific validity studies, code "2". If one or more validity studies
are cited, code "3". If data from the project relative to validity are
presented, code "4". If validity is inferred from inspection of the tests,
but without collection of any sort of data and without reference to the
literature either generally or specifically, code "5".

D.10.d. Reactivity of measure. The question here is the extent to which it
is obvious that the assessment is related to desired treatment outcomes, with
a likely effect on Ss' responses to the test (affecting the validity of
scores) and to the treatment (affecting treatment, and external,
validity). A scale such as the ATDP or the ATHI so obviously assesses
attitudes toward the disabled, with the most likely implicit assumption being
that positive attitudes are good, that it should be coded "3=high". On the
other end of the scale, when nondisabled Ss' interactions with disabled Ss
are observed without the Ss' knowledge, reactivity is likely to be absent and
"1" should be coded. An attitude scale in which items about disabled persons
are embedded within a number of other items would be scored "2=moderate"; so
would observations where the observers were visible to the Ss, but had
observed long enough, and with sufficient unobtrusiveness, that the Ss
probably had become acclimated to their presence.

D.10.e. Adequacy of validity. Based on the evidence presented in the
article, indicate how valid you believe the dependent measure is for
assessing attitudes toward disabled persons in this particular study. You
will have to weigh the evidence for validity presented in the report, as well
as the information given about the test and its relation to the definition of
attitudes for the study, if presented. For example, if the ATDP, an
assessment of general attitudes toward the disabled, is used as an assessment
instrument in a study aimed at changing attitudes toward persons with a
particular disability, code low ("1"), even though, if used appropriately in
a study aimed at modifying attitudes toward disabled persons in general, the
ATDP would be coded as having moderate ("2") validity. Remember that the
RGEPS is always scored as having low ("1") validity.

D.11. Data collection. a. Type. If you cannot discern from the report how
the data were collected, code "0". If data were collected by regular staff
of a program within which the research was conducted (e.g., teachers at a
school where the research project was carried out), code "1". If the project
director/author of the report collected the data himself or herself, code
"2". If the data were collected by a paid assistant, code "3". If the data
were collected without contact between project personnel and the respondent
(e.g., through the mail), code "4". Data collected over the telephone would be
coded according to the person who did the telephoning.

D.11.b. Blinded collection. The question here is whether the person or
persons collecting the data knew the treatment groups which Ss were in and
whether it was a pre or post assessment. "Not applicable" ("1") would be
scored only if there were no human interaction involved in the collection,
such as data collected via a mail survey or collected automatically as part
of a computerized testing program. Code "no" ("2") for group
paper-and-pencil tests and for interview or observational data unless it is
clear that blinded collection was done; researchers may overlook the
importance of blinded collection of paper-and-pencil data, but are aware of
the need for blinded observation and interviewing, and are likely to report
it. Code "3" in the situation where partial information was kept from the
collectors about the group membership of the subjects or whether a pre or
posttest was being administered. Code "4=yes" only if it is clear that
blinding was present. Often the lack of blinding will be clear from the way
in which the study is presented, for example, if the person who introduced a
treatment film also administered the tests following the film.

D.12. Blinded scoring. The same considerations and definitions apply here as
for "blinded collection". The "not applicable" ("1") code should be used for
attitude scales for which scoring involves no judgment, only the addition of
pre-assigned item values.

D.13. Time of Posttest. If you cannot tell from the report how long after
end of treatment posttest was administered, code "0". In most studies, a
posttest will be administered very soon (immediately, within a day, or at the
next class session) following the treatment. In that case, code "1". If the
administration of the posttest is delayed, perhaps to obscure the relations
between the treatment and the test, code "2". If the first posttest, or
another form of it, is administered as a follow-up, code "3". If more than
one follow-up administration of a test is reported, code "3" for the ES from
each follow-up testing.

D.14. Weeks after intervention to posttest. If you coded "0" in D.13, enter
x's here. If you coded "1", "2", or "3", the report should contain the time
between the end of treatment and posttesting or you should be able to
estimate the time. For same-day testing, enter zeroes. Otherwise, enter the
time in weeks, rounded to one decimal place (1 day = .1; 2 days = .3; 3 days
= .4; 4 days = .6; 5 days = .7; 6 days = .9). That is, a 2-day time period
would be entered as 00.3 weeks. If you coded "3", enter the time in weeks
between the end of treatment and the follow-up testing.
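
The day-to-week values listed in D.14. are what results from dividing the
number of days by 7 and rounding to one decimal place. A minimal sketch,
assuming the delay is known in whole days; the function name, the "xx.x"
placeholder for missing data, and the fixed two-digit field width (inferred
from the "00.3" example) are assumptions, not part of the instrument.

```python
from typing import Optional

DAYS_PER_WEEK = 7
MISSING = "xx.x"  # hypothetical placeholder for "enter x's"


def weeks_to_posttest(days: Optional[int]) -> str:
    """Format the delay from end of treatment to posttest, in weeks."""
    if days is None:  # D.13 was coded "0": the timing cannot be determined
        return MISSING
    # One decimal place, zero-padded to match the "00.3" shape in D.14.
    return f"{days / DAYS_PER_WEEK:04.1f}"
```

Same-day testing corresponds to `weeks_to_posttest(0)`, which yields zeroes
as the item requires.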
E. INTERNAL VALIDITY

E.1. Design. The first three categories ("1", "2", "3") are the Campbell and
Stanley experimental group designs. Each assumes random assignment of the
units of analysis (whether individual subjects or groups) to the
experimental conditions. For a design with random assignment in which a
posttest for an experimental group is compared with a pretest used as a
simulated control group posttest, code "2". The only one of Campbell and
Stanley's quasi-experimental designs included here is "4", the nonequivalent
control group design. (If a nonequivalent control group design is used
without pretesting, code "8" and write in "4, posttest only".)

It is unlikely that the literature being reviewed will contain many
reports of single subject research ("5"), that is, research in which
behavioral data are collected on individuals, commonly with a baseline,
followed either by a treatment and withdrawal of treatment to see whether the
behavior returns to baseline, or using a multiple baseline design in which
the treatment is introduced to different individuals at different times,
without withdrawal to baseline, to determine if changes in behavior occur
with introduction of the treatment as predicted.

Two pre-experimental designs are included: "6", the pre-post,
single-group design, in which there is no control group and one group is
administered a pretest, given a treatment, and then administered a posttest;
and "7", the static group design, in which the researcher identifies already
existent groups, one of which has received a treatment of interest and the
other one of which has not, and administers a posttest to each to examine the
effects of the treatment. This is a design which may turn up in reports of
research on mainstreaming.

If "8=other" is coded, enter the Campbell and Stanley design name and
number if possible.

E.2. Assignment to groups. The preferred method of assigning subjects to
experimental groups is to assign them randomly. If a report states that
subjects (or groups, if that was the unit of analysis) were assigned randomly
or describes a process of assignment which was random, including random
assignment from strata (e.g., gender), code "1". If Ss were matched on
relevant variables and then assigned randomly to groups, code "2". Random
assignment is a desired attribute, and researchers will be eager to report
that they used it. So, code "1" or "2" only if it is obvious from the report
that a random process was used. In some cases, researchers will obtain a
treatment group and then select control subjects randomly from another larger
group or select from a larger group subjects who are matched with individuals
in the treatment group on some variable or variables. In either case, code
"3". "4" would be coded when, in a nonequivalent control group design, the
researcher randomly assigned intact groups but did not use group as the unit
of analysis. "5", convenience, refers to what is actually nonassignment,
that is, simply using available groups as they are constituted. The
assignment is really done by someone other than the researcher. The typical
case is the use of one classroom for a treatment group and another for a
control group, with no control over who is in which group. In a design such
as the pre-experimental, pre-post, one-group design, there is no assignment
to groups, and "6" should be coded.

E.3. Threats. The scale used here to code threats to internal validity is
the same as that used for coding Treatment Validity (C.9., p. 9). In cases
where the design is generally assumed to control a threat and there is no
reason to suspect that the threat is not controlled, code "1=not a plausible
threat". For example, random assignment is an excellent control for
testing; that is, if groups have been randomly assigned and both are
administered the pretest, the effect of taking the pretest should be
consistent for both and the interest is in whether there is a treatment
effect above and beyond the testing effect on both groups. Maturation is
also well controlled by random assignment, as are statistical regression and
selection. However, if inspection of the data (for example, pretest data)
indicates that, even with random assignment, the groups were initially
different on important variables, selection, and by extension, maturation and
statistical regression (really selection by maturation interaction or
selection by regression interaction), are also plausible threats. When
random assignment is not used, selection is a particularly plausible threat.

If there is no evidence, including your knowledge of research settings,
as to whether a threat is present or controlled, code "can't tell" ("0"). In
some cases, a substantial ("3") threat or major ("4") threat may be present.
For example, if convenience groups were used, in the absence of evidence that
the groups were similar (and even without evidence that they were different),
a substantial selection threat ("3") should be coded; if the groups were not
randomly assigned and there are differences between them that might be
related to treatment outcomes, or if scores on dependent measures were based
on observer inference and the observers were not blind to the treatment
conditions and/or to whether or not it was a pretest or posttest condition,
"4" should be coded for selection and instrumentation, respectively.

E.3.a. Treatment validity. Whether Ss experience the treatment as intended
is an important aspect of internal validity. Enter the code from D.9.b. here.
For a mainstreaming study for which category D.9.b. is not coded, enter an x.

E.3.b. Maturation. Maturation refers to "biological or psychological
processes which systematically vary with the passage of time" (Campbell and
Stanley, p. 8), independent of the treatment. Included are not only
physiological or cognitive changes due to growing older, but also subjects
growing hungry, tired, or bored. Maturation is a strong threat in single-group
designs; with treatment and placebo/control groups, the concern is whether
there was differential maturation.

E.3.c. History. Here the concern is with events other than the treatment
that occurred between the pretest and posttest and might have had a
differential effect on results. Of concern is extra-session history (that
is, events that occur outside of the treatment sessions) and intra-session
history (that is, events that occur within treatment sessions, such as
distractions from the noise of the lawn-mower, for only one group).

E.3.d. Testing. Simply taking a test may improve Ss' performance when they
take the test again (including the taking of an alternate form).
Sensitization, as well as practice, is a part of this effect. That is, if
the purpose of a test is obvious, as on an attitude scale, people may be more
likely to change behavior (positively or negatively) on a second testing as a
result of the first testing than if the purpose of the test is not obvious to
them. Testing is a strong threat ("4") with a pre-post, single-group design;
with treatment-control designs, the question is whether there is a
differential effect between the experimental and control groups.

E.3.e. Instrumentation. "Instrument decay" is another term used to refer to
differences or changes in instrumentation which might account for an apparent
treatment outcome. Included would be differences in test administration
(e.g., a test administrator who is more positive and encouraging with the
treatment group than with the control group) or data collection, especially
where observers code behaviors or comments and know they are observing
different treatment groups or know which is the pre- or the posttest. When
there is no information to indicate that the administrator of a
paper-and-pencil test is blind to Ss' group memberships, code a "2". Lack of
blind testing with observations or interviews would merit at least a "3", and
perhaps a "4" depending on the circumstances.

E.3.f. Statistical regression. On retesting, when a less than fully reliable
test is used, the means of groups will tend to regress toward the mean of the
population from which they came. This effect will be greater for extreme
groups. Consequently, when one group initially has a more extreme mean than
another group, its regression to the mean will be greater and might be
confused with a treatment effect.

E.3.g. Selection. When groups have different characteristics at the
beginning of the research study that might account for treatment outcomes,
this threat to internal validity is referred to as "selection". Random
assignment is an excellent control for this threat, but pretreatment
differences can still be large by chance. If there is evidence, such as
pretreatment data, to indicate that the groups did not turn out, by chance, to
be quite different on pertinent variables, or if there was random assignment
from strata of a relevant variable, code "1" with random assignment.
Otherwise, with random assignment, code "0", unless there is evidence of
pretreatment differences; then, "2", "3", or "4" should be coded. Whenever
convenience samples are used, selection is likely to be a threat, with a
judgment about the degree dependent upon the information that is provided or
implicit about the groups.

E.3.h. Experimental mortality. When subjects are lost or drop out
differentially from experimental groups, outcomes may be due to this
differential loss of subjects rather than to the treatment. Although the
basic question is whether the characteristics of those who dropped out or
were otherwise lost from different groups were such as to affect outcomes,
differential rates of loss or drop-out suggest that treatment effects might
be a result of loss of subjects who differed on variables related to
outcomes. Even though the author does not mention mortality, you may be able
to pick it up through observing discrepancies between the n's reported and
the d.f. in reports of analyses.
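
The comparison of reported n's with d.f. can be sketched as a small check.
This is only an illustration, assuming an independent-groups t-test or
one-way analysis of variance, where the error d.f. equal the total N minus
the number of groups; the function name is hypothetical and other designs
need a different d.f. rule.

```python
def subjects_unaccounted_for(group_ns, error_df):
    """Return how many subjects the reported d.f. fail to account for.

    Assumes error d.f. = N - k (independent groups); a positive result
    suggests subjects were lost between assignment and analysis.
    """
    expected_df = sum(group_ns) - len(group_ns)
    return expected_df - error_df


# Two groups of 30 reported, but the t-test shows 52 d.f. instead of 58.
print(subjects_unaccounted_for([30, 30], 52))
```

A nonzero result is only a flag for the coder; it does not show from which
group the subjects were lost.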

E.3.i. General internal validity. Based on your coding of the seven threats
to internal validity and your general evaluation of the design, assign a
General Internal Validity rating. Perhaps the best way to define high ("1"),
medium ("2"), and low ("3") validity is by specifying the extremes. A study
should receive a "1" if there is evidence that an excellent design was well
executed. In that case, there should be no more than one or two "0" ratings
and/or no more than one or two "2" ratings. On the other end of the scale,
low ("3") would be coded for a study which had one or more "4" ratings, or
two or more "3" ratings, three or more "2" ratings, or four or more "0"
ratings. In addition, a design with one "3" rating with two or three "2"
ratings should be coded "3". Medium internal validity ("2") would then be
coded for a study which had no "4" rating, at the most one "3" rating, fewer
than two "2" ratings, and fewer than three "0" ratings.
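
The rating rule can be sketched in code as a consistency aid for coders.
This is a rough sketch under stated assumptions, not the project's actual
procedure: the "low" conditions are taken literally from the text; "high" is
assumed to mean no "3" or "4" ratings, at most two "0" and at most two "2"
ratings, plus the coder's judgment that the design was excellent (the
hypothetical excellent_design flag); "medium" is treated as everything else.

```python
from collections import Counter


def general_internal_validity(threat_codes, excellent_design=True):
    """Return 1 (high), 2 (medium), or 3 (low) from threat codes 0-4.

    Non-numeric entries such as "x" are simply not counted.
    """
    counts = Counter(threat_codes)
    zeros, twos, threes, fours = (counts[c] for c in (0, 2, 3, 4))

    # "Low" conditions as listed in the text.
    low = (
        fours >= 1
        or threes >= 2
        or twos >= 3
        or zeros >= 4
        or (threes == 1 and twos >= 2)
    )
    if low:
        return 3

    # Assumed reading of "high": a clean profile and an excellent design.
    high = excellent_design and threes == 0 and zeros <= 2 and twos <= 2
    return 1 if high else 2


print(general_internal_validity([1, 1, 2, 1, 3, 1, 1]))
```

The final rating still rests on the coder's general evaluation of the
design; the function only checks that a rating agrees with the counts.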

F. RESULTS

F.1. Statistical significance. Statistical significance will be coded,
if available, only in terms of whether results were or were not significant
at the .05 level.

F.2. Author's conclusions about effectiveness. a. Qualified? Here the
question is whether the author limited his or her interpretation of the
results based on the characteristics of the study or the need to replicate
the results. Authors may limit their conclusions by reference to the
restricted nature of their sample ("1"), by reference to possible
interactions between the treatment and personological or ecological
population characteristics ("2"), features of the design ("3"), limitations
in the methods used to assess outcomes ("4"), the lack or inadequacy of
efforts to verify that the treatment was actually implemented ("5"), the need
to replicate the study to determine whether the results are reliable or
generalizable ("6"), or some other limitation ("7") or combination of
limitations ("8"). If the last, enter the numbers of the limitations in the
combination. Recommendations for future research are not qualifications per
se; they are qualifications only if stated in terms of caution in
interpreting the results of the reported study. For an interaction ES, enter
an x.

F.2.b. Treatment effective? This category does not ask for your judgment in
regard to the effectiveness of the treatment, but asks you to code the
author's conclusion about treatment effectiveness. If the author does not
indicate whether he or she believed the treatment was effective, code "0".
Otherwise, code "1" if the conclusion is that the treatment had no effect
(including the lack of Treatment A vs. B differences); code "2" if the
author concluded that the outcomes were equivocal, that is, that a conclusion
that the treatment was effective was not clearly supported; code "3" if the
conclusion is that the treatment did produce a positive effect on attitudes
toward disabled persons; code "4" if it was concluded that the result was a
negative effect, that is, the result was less positive attitudes toward
disabled persons. Remember, what is sought is the author's conclusions in
regard to a treatment's effectiveness relative to another, and this judgment
may vary across dependent measures and, thus, ESs. For an interaction ES,
enter x.

For  Prior  Contact  ESs,  the  question  is,  did  prior  contact  have  an  effect 
on  treatment  outcomes  and,  if  so,  what  effect? 

F.3. Effect size(s). a. Available. If information to compute an ES is not
available and, even though information on statistical significance is
available, the direction of the difference cannot be determined, code "0". If
information is available to calculate or estimate an effect size (a D,
correlation coefficient, or both), code "1", unless the effect size involves
a negative attitude change or no change by a treatment group. Then code "2",
"negative change". The "2" code is not applicable to posttest-only designs,
but is applicable to pre-post designs. In a "Treatment A vs. B" comparison,
code "2" if the group exposed to either treatment (anticipated to have a
positive effect on attitudes) has a decline or no change in assessed
attitudes. Use the "2" code, if appropriate, with follow-up as well as
immediate or delayed pre-post comparisons, and with secondary effect sizes.
If an ES cannot be computed, but statistical significance is available and
the direction of difference is negative, code "3". If an ES cannot be
computed, but statistical significance is available and the direction of
difference is positive, code "4".
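
The choice among codes "0" through "4" can be sketched as a short decision
function. This is an illustrative reading of the rules above, not part of
the coding instrument: the function and parameter names are hypothetical,
and direction is assumed to be "+", "-", or None when it cannot be
determined from the report.

```python
def es_available_code(es_computable, negative_or_no_change=False,
                      posttest_only=False, direction=None):
    """Return the F.3.a availability code (0-4) for one effect size."""
    if es_computable:
        # "2" flags a treated group that declined or did not change;
        # it is not used with posttest-only designs.
        if negative_or_no_change and not posttest_only:
            return 2
        return 1
    # No ES can be computed: fall back on the direction of the difference.
    if direction == "-":
        return 3
    if direction == "+":
        return 4
    return 0


print(es_available_code(False, direction="+"))
```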

F.3.b. D. 1) ± D. If a D can be computed or estimated (see the
COMPUTATION OF EFFECT SIZES section), code it here. Include a + or a - sign
to indicate the direction of the effect size, with a + indicating a
difference in the direction of more positive attitudes for the treatment
groups or from the pre- to posttest. When you enter a - sign, always circle
it as a reminder to the key entry operator that a - is a value to be punched,
not an indicator of missing data. With a Treatment A vs. B comparison,
assume that Treatment A (the first group) is the treatment and assign a + or
- accordingly. With an interaction effect, no D can be computed, and x's
should be entered.
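
The sign convention can be illustrated with the conventional standardized
mean difference. This is a minimal sketch, assuming D is the difference
between the treatment and control means divided by a standard deviation;
the manual's own formulas are in the COMPUTATION OF EFFECT SIZES section,
which is not reproduced here. The higher_is_more_positive flag is a
hypothetical addition for scales on which low scores indicate more positive
attitudes.

```python
def signed_d(mean_treatment, mean_control, sd, higher_is_more_positive=True):
    """Standardized mean difference; + means more positive attitudes
    in the treatment group (or at posttest)."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    d = (mean_treatment - mean_control) / sd
    return d if higher_is_more_positive else -d


def format_d(d):
    """Render D with an explicit sign, as it would be entered on the form."""
    return f"{d:+.2f}"


print(format_d(signed_d(55.0, 50.0, 10.0)))
```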

In using ES Information Missing CODING INSTRUMENTS, enter a + or - sign
(in this case, circling either so that it will stand out for the key entry
operator) if the direction of the group difference can be determined even
though an ES cannot be computed. Enter x's in the remaining spaces. If
direction of difference cannot be determined, enter an x in the first space
and zeroes in the remaining spaces.

F.3.b.2) Source. If no D is available, as for an interaction ES, code "0".
If an appropriate standardized mean difference is reported, code "1". If you
calculate a D using any of the means and standard deviations indicated in
3.b.3) and 3.b.4) below, and discussed in the COMPUTATION OF EFFECT SIZES
section, code "2". If the D is estimated, based on a t-test for comparison
between means or the F from a one-way analysis of variance with only two
groups, code "3". If the D is estimated from a t-test for correlated means
for which the correlation is available, code "4". If the correlation is not
available and an estimate of .50 was used, code "5". If D is estimated from
the F for the treatment effect in a factorial analysis of variance in which
there are only two levels of treatment, code "6". If the D is estimated from
an F for a one-way analysis of covariance for which there are only two groups,
and the correlation between scores on the dependent measure and the covariate
is available, code "7"; if the dependent measure-covariate correlation
coefficient is not available and .50 is used as an estimate, code "8". If
the data were reported in proportions and chi-squared or is used to
estimate D, code "9". If the corresponding t value for a Mann-Whitney U or
some other nonparametric statistic other than chi-squared is used to estimate
D, code "10". If an r_pb is the only information reported, and you use that
to estimate D, code "11". If the only information provided in the report is
an exact significance level from which you determine the value of a statistic
(t, F, Z, chi-square) and then use that to estimate D, code "12".
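
For source codes "3" through "5", the estimates can be sketched with the
standard textbook conversions. These are offered as an illustration only and
may differ in detail from the formulas in the COMPUTATION OF EFFECT SIZES
section, which is not reproduced here; the function names are hypothetical.

```python
import math


def d_from_t(t, n1, n2):
    """D from an independent-groups t (source code "3")."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


def d_from_f_two_groups(f, n1, n2, sign=1):
    """D from a two-group one-way F, where t = sqrt(F).

    F carries no direction, so the sign must be supplied by inspection.
    """
    return sign * d_from_t(math.sqrt(f), n1, n2)


def d_from_correlated_t(t, n, r=0.50):
    """D from a correlated-means t with n pairs (source codes "4" and "5").

    r is the pre-post correlation; .50 is the fallback named in the text.
    """
    return t * math.sqrt(2.0 * (1.0 - r) / n)


print(round(d_from_t(2.0, 8, 8), 2))
```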

F.3.b.3) Scale of mean difference. In F.3.b.3) and F.3.b.4), you are to
provide information about the elements used in computing the D for the
particular effect size, if "2" was coded in F.3.b.2) above. If any category
other than "2=calculated" was coded in F.3.b.2), code "0=not applicable". The
categories of mean differences correspond to those discussed in COMPUTATION
OF EFFECT SIZES. If the means used in computing D were the unadjusted
posttest means, code "1". If a posttest mean is compared with the pretest
mean of another group to simulate a control group comparison, code "1". If
pretest and posttest means were available so that you could subtract the
pretest from the posttest means and obtain mean gains for each group to use
in the computation of D, or if change means were reported, code "2". If
means adjusted as part of an analysis of covariance were used, code "3". If
the researcher used the correlation between the pretest and the posttest to
predict posttest scores and then computed residuals (the differences between
actual and predicted posttest scores), and the mean residuals were used to
compute D, code "4".

F.3.b.4) Standard deviation. Again, unless "2" was coded in F.3.b.2),
indicating that you were able to calculate a D directly, code "0". If the
standard deviation used to compute D was based on posttest data for a control
or placebo group (or pooled standard deviations in the case where there was
more than one control or placebo group), or if it comes from a pretest which
was treated as a posttest for the purpose of a simulated control group
comparison, code "1". If the standard deviation used in computing D came
only from pre-test scores (as in pre-post, one-group designs or when control
group posttest standard deviations aren't reported), code "2". If both a
posttest standard deviation for the control group and pre-test standard
deviations on the dependent measure for the control and experimental groups
are available to be pooled (using the formula presented in the COMPUTATION OF
EFFECT SIZES section), code "3". In the case that the standard deviation
must be obtained from the pooled posttest sums of squares based on all of the
groups in the design (see the COMPUTATION OF EFFECT SIZES section), code
"4", "5", "6", or "7", as appropriate. If the only standard deviation
available is the control group standard deviation for covariance adjusted,
residual, or gain scores and you are able to estimate the unadjusted control
group standard deviation using one of the formulae in the COMPUTATION OF
EFFECT SIZES section, code "8" for an adjusted covariance or residual
standard deviation or "9" for an adjusted gain score standard deviation.

F.3.b.5) Mdn primary D for type of assessment. In order to have one D for
each type of assessment (as coded in D.6) which figures in the primary ES(s)
for a research report, enter the median D here in the columns for primary
ESs. (Enter x's in the secondary ES columns.) If the number of D's is odd,
enter the middle one. If the number of D's is even, take the mean of the two
(or more) middle values. If there is only one D for a type of assessment,
re-enter it here. If the report contains ESs for attitude change efforts
prior to mainstreaming and for mainstreaming, compute separate median D's (and
correlation coefficients, F.3.c., and variance ratios, F.3.d.) for the "prior"
primary ESs and the "mainstreaming" primary ESs. Also, if follow-up means
are reported, compute median D's, correlation coefficients, and variance
ratios separately for the immediate (or delayed) assessments and the
follow-up assessment or assessments. Compute separate median ESs also if
there is a "true" within study replication (e.g., Experiment I and II) or if
results from different populations of Ss are reported in what amount to two
different studies (e.g., use of a curriculum with attitude assessments of the
students who study it and the teachers who teach it). If the "within study
replication" coded in category A.9.b. is a pseudo-replication (i.e., you have
coded it because of the different levels of subjects, such as grade levels),
compute one median.
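
The median rule (middle value for an odd count, mean of the two middle
values for an even count) is what Python's statistics.median computes. The
sketch below groups primary D's by type of assessment; the pair layout and
the function name are assumptions made for illustration only.

```python
from statistics import median


def median_d_by_assessment(effect_sizes):
    """Map each assessment type to the median of its primary D's.

    effect_sizes is an iterable of (assessment_type, d) pairs.
    """
    groups = {}
    for assessment_type, d in effect_sizes:
        groups.setdefault(assessment_type, []).append(d)
    return {kind: median(values) for kind, values in groups.items()}


primary = [("1", 0.25), ("1", 0.75), ("1", 0.5), ("2", 0.25), ("2", 0.75)]
print(median_d_by_assessment(primary))
```

Separate medians (prior vs. mainstreaming, immediate vs. follow-up, true
replications) would be obtained by calling the function once per subset.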

F.3.b.6) Mdn primary D, overall. Here the purpose is to indicate the average
D for all dependent measures that figure in primary ESs for the study.
(Compute the median as indicated in 3.b.5.) If there is only one D for the
study, re-enter it. Enter x's in secondary ES columns. Compute median
primary overall D's separately in the same circumstances as in F.3.b.5.

F.3.c. Correlation. 1) Type. If no correlation coefficient can be obtained,
code "0". For any effect size which involves the comparison of two group
means on a dependent measure, obtain the point biserial correlation (r_pb), as
described in the COMPUTATION OF EFFECT SIZES section, and code "1". For an
interaction effect, Eta-squared is the appropriate correlation coefficient
("2"). When data are reported in terms of proportions of responses for two
groups being compared, the Phi coefficient will be the appropriate correlation
coefficient ("3").

Recall that when an F-ratio is reported for multiple means but the means
are not reported, no effect sizes can be obtained for pairs of means. By the
same token, an analysis of proportions which includes more than two groups
will not yield effect sizes, unless contingency tables are reported so that
you can break out the data by pairs of groups and compute your own
chi-squares and, then, effect sizes. Similarly, if an analysis of proportions
for two groups has more than two levels of response (e.g., a greater than
2-by-2 contingency table, in which there are two columns, or rows, for
treatment groups but three rows, or columns, for attitude assessment items),
it will usually not be possible to obtain an effect size. You might be able
to break such a table into a number of two-by-two tables to get separate
effect sizes. Or, there may be occasions when the various levels of the
dependent measure are equivalent indicators of attitudes and all of the
results are in the same direction. In that case, Cramer's V, an extension of
the Phi coefficient, can be computed (see the COMPUTATION OF EFFECT SIZES
section) and should be coded.
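
The coefficients named above can be sketched with their usual definitions.
This is an illustration under the standard textbook formulas, which may
differ in detail from the COMPUTATION OF EFFECT SIZES section (not
reproduced here); the function names are hypothetical. Phi and Cramer's V
come out unsigned, so the direction still has to be assigned by inspection.

```python
import math


def phi_from_chi_square(chi2, n_total):
    """Phi for a 2-by-2 table: sqrt(chi-square / N)."""
    return math.sqrt(chi2 / n_total)


def cramers_v(chi2, n_total, n_rows, n_cols):
    """Cramer's V: sqrt(chi-square / (N * (min(rows, cols) - 1)))."""
    return math.sqrt(chi2 / (n_total * (min(n_rows, n_cols) - 1)))


def r_pb_from_d(d, n1, n2):
    """Point biserial r estimated from D; the constant is 4 for equal n's."""
    a = (n1 + n2) ** 2 / (n1 * n2)
    return d / math.sqrt(d * d + a)


print(round(r_pb_from_d(0.5, 20, 20), 3))
```

For a 2-by-2 table Cramer's V reduces to Phi, which is why it serves as the
extension for larger tables.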

F.3.c.2) ± coefficient. With Eta-squared for an interaction effect, no
direction of relationship can be specified, and an x should be coded in the
first space. For the other coefficients, a direction can be specified through
inspection. Enter a + if the treatment group has the higher mean or
proportion and a - if the control group does. Check back to item F.3.b., as
the + or - signs should be consistent for that item and this one. With
Treatment A vs. B comparisons, assign a direction as indicated in F.3.b.1).
Be certain to circle a - when you enter it.

F.3.c.3) Source. If no correlation was available for F.3.c.2), code "0".
Indicate here whether the correlation which you recorded in F.3.c.2) came
from the research report ("1"), from your calculations ("2") (as when a Phi
coefficient is calculated from a chi-square), or was estimated from a D
("3").

F.3.c.4) Mdn primary coefficient for type of assessment. The purpose here is
the same as for D's in F.3.b.6). Compute the median coefficient in the same
way; enter in primary ES columns, with x's in secondary ES columns.

F.3.c.5) Mdn primary coefficient, overall. The purpose here is the same as
for D's in F.3.b.7). Compute the median coefficient in the same way.

F.3.d. Variance ratio. 1) Ratio (S_E²/S_C²). Of interest here is whether
there is any indication that after treatment the experimental group has less
or greater variability than an untreated population or a differently treated
population. The variance ratio is obtained for primary ESs by dividing the
variance for the treatment group by the variance for the control or placebo
group for the particular effect size being coded. In the case where the
effect size involves a Treatment A vs. B comparison, put in the numerator the
variance for the group which you have coded as the experimental group
(Treatment A) on the earlier pages of the coding instrument and put the
variance for the other group (Treatment B) in the denominator. The variance
used must not be pooled; i.e., it is not to come from pooling the control
group posttest variance with pre-test variances, nor can a variance ratio be
calculated when D is computed based on an estimate of the variance obtained
from a sum of squares that includes the treatment posttest scores. For
secondary ESs other than within-gender treatment comparisons, enter x's.
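
Because reports usually give standard deviations rather than variances, the
ratio can be sketched as follows. This is a minimal illustration with a
hypothetical function name; it assumes both values are unpooled posttest
standard deviations for the two groups being compared.

```python
def variance_ratio(sd_treatment, sd_control):
    """Treatment-group variance divided by control-group variance."""
    if sd_control <= 0:
        raise ValueError("control standard deviation must be positive")
    return (sd_treatment ** 2) / (sd_control ** 2)


# A ratio above 1 means the treated group is more variable after treatment.
print(variance_ratio(6.0, 3.0))
```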

F.3.d.2) Median primary variance ratio for type of assessment. The purpose
here is the same as for D's in 3.b.6). Compute the median coefficient and
enter in the same way.

F.3.d.3) Median primary variance ratio, overall. The purpose here is the
same as for D's in 3.b.7). Compute the median coefficient and enter in the
same way.


G. SUPPLEMENTAL INFORMATION

G.1. Information gain. This item is relevant for studies in which the
presentation of information is the treatment/intervention to change
attitudes. The question is whether the intervening variable of information
gain was established. For reports for which "2=information" was not scored
in C.4. Treatment/Intervention Technique, categories G.1.a., b., & c. should
be coded as "not applicable" and x's inserted in G.1.d. If another
intervening variable is the object of the treatment (such as anxiety), is
assessed, and ESs can be computed, code it here and make a note on the
COMMENTS ON CONVENTIONS Sheet.

G.1.a. Reported? Code here whether information gains were reported for
relevant studies. Code "0" if information was not scored in C.4. as the
intended treatment/intervention technique. Code ESs separately. That is,
if information gains are reported for the groups in some ESs but not others,
code accordingly in the ES columns.

G.1.b. Type of report. "1=verbal" refers to verbal descriptions of
information gain outcomes with no reference to statistical significance and
no reporting of means or standard deviations. If the author refers to
"significant" or "nonsignificant" results, or reports levels of P, code "2".
If means (or medians) alone, or with standard deviations, are reported, code
"3".

G.1.c. Conclusions. Code here the author's conclusions in regard to
information gains. If information gain is not reported (i.e., 0 or 2 was
coded in G.1.a.), code "0", "not applicable". If no conclusion is stated,
but there is a report of gains (i.e., "1=yes" was coded in G.1.a.), select a
code based on how the reported results would typically be interpreted in
educational research reports. For example, if the result is statistically
significant, or, without statistical significance, if an ES is 1 or larger,
code "1" ("clear gain"). If there is not enough information to decide on
"1", "2", or "3", or if the evidence is inconclusive (e.g., a small ES), code
"4".

G.1.d. Effect size: D. If a D for information gain can be computed or
estimated (see COMPUTATION OF EFFECT SIZES) for a primary or secondary
attitude ES, enter it here, with a + or - sign. If D's are available for
more than one information test, enter the median D. In general, if "0" or
"2" was coded in G.1.a., if a D cannot be computed or estimated, or if the
attitude ES is an interaction ES, enter x's.

G.2. Type of study. The purpose of this category is to identify reports of
evaluations of ongoing, in-place courses, "1" (such as a regular college
abnormal psychology or special education course or an inservice course), or
programs, "2" (such as mainstreaming or a master's or other degree program),
which involve no special treatment, as contrasted with reports of studies of
treatments or interventions introduced in order to investigate the effects
experimentally, "3". In some cases, this category will be redundant with the
code "14" or "15" under category C.4.a. Information (2) Delivery mode (p. 5
of the Coding Instrument) when the course or program conveys information, but
it will pick up the evaluation of regular program components that are, for
example, of a contact nature such as practica. On the other hand, the
categories in 10. Mainstreaming do not provide a place to enter variations in
time of mainstreamed contact. Studies that investigate variations in contact
time should be coded "4" for later identification.
CODING SUMMARY

1. Minutes spent coding. On the cover of the coding sheet, you will have
indicated the time when you started to read the research report and when you
completed the coding instrument, whether in one or more sittings. Enter here
the total number of minutes for reading and coding.

2.  Coder.  Enter  here  your  code  number  as  indicated  on  the  coding 
instrument. 


RETURN TO 1 AND COMPLETE CHECK LIST.




CHECKLIST 

1. Citation checked. Check the citation for the report as entered in the
alphabetical listing of reports to be sure that all information is correct.
Indicate any changes and give to the secretary. Once done, put a check in
the space provided.

2. References checked. If the reference list for the report has been
checked to determine whether it contains citations for reports to be included
in the review of literature and any such citations have been taken off to be
followed up on, place a check in the space provided.

3. Every space marked. To be certain that no available information is left
uncoded, it is important that every coding space be filled in, with a code
number or an x to indicate the lack of information. Skim the Coding
Instrument to be certain all spaces are filled and that all numerals are
legible. Then place a check in the space provided.

4. ES data available. If the data to compute all of the relevant ES(s) were
available, write "yes" in the space provided. If not, write "no" or
"requested", as appropriate.

5. ESs computed. If computation of ESs is complete, check space provided.
Otherwise, leave space blank.

6. ESs checked. Leave space blank until someone other than the person who
originally coded the report checks on the appropriateness and computations of
any ES(s). Then the person who checked the ESs is to enter his/her initials
in the space provided.

7. Comments on Conventions Sheet. If a COMMENTS ON CONVENTIONS sheet was
used, write "yes" in the space. If not, enter "no".

8. Comments on Study Sheet. If a COMMENTS ON STUDY sheet has been
completed, write "yes" in the space. If none was needed, enter "no".

9. Scoring log completed. The SCORING LOG provides a record of the reports
each scorer has coded, including the serial order and whether they were
scored as agreement checks. Once the LOG item is completed, check the space
provided.

10. Report disposition sheet completed. The last thing to be done is to
complete the REPORT DISPOSITION sheet. Once it is filled in, check the space
provided in the checklist.
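
Checklist item 3 (every space marked) lends itself to a mechanical check
once a coding sheet has been keyed in. The sketch below is an illustration
only: it assumes, hypothetically, that a keyed sheet is a mapping from item
labels to the text entered in each space, and the function name is
invented for the example.

```python
from typing import Dict, List


def unmarked_spaces(coding_sheet: Dict[str, str]) -> List[str]:
    """List the coding spaces that hold neither a code number nor x's."""
    missing = []
    for item, entry in coding_sheet.items():
        text = entry.strip()
        # A code number may carry a sign and one decimal point (e.g. a D).
        is_code = text.lstrip("+-").replace(".", "", 1).isdigit()
        # One or more x's mark a lack of information.
        is_x = text != "" and set(text.lower()) == {"x"}
        if not (is_code or is_x):
            missing.append(item)
    return missing
```

An empty result corresponds to the condition under which the check for
item 3 would be placed.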
I. PRIOR CONTACT CODING SHEET
The particular interest is in whether there were prior contact effects
on treatment outcomes with the dependent measures and samples identified as
relevant for the primary ESs already coded. (Control of prior contact as an
antecedent variable is a secondary interest.)

For studies where the effects of prior contact on treatment outcomes
were assessed and effect sizes can be obtained, enter the code to identify
the type of comparison in each secondary prior contact ES in A.6.d. on the
original Coding Instrument. Enter "13" if there is an analysis of prior
contact levels within the treatment group; enter "14" if there is analysis of
treatment effects within prior contact levels across groups (as treatment
effects within levels of gender were coded); enter "15" if the interaction of
prior contact and treatment is analyzed. If "15" can be coded, do not code
"13" or "14". If "14" can be coded, do not code "13". If more than one
definition of prior contact (e.g., number and frequency) are analyzed
separately, or if quality of prior contact is analyzed separately, each is
the basis for an ES.
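
The precedence among codes "13", "14", and "15" can be sketched as below.
The function and argument names are hypothetical, chosen only for this
illustration, and None stands for a study in which none of the three types
of analysis is present.

```python
from typing import Optional


def prior_contact_comparison_code(within_treatment_group: bool,
                                  treatment_within_levels: bool,
                                  interaction: bool) -> Optional[int]:
    """Pick the A.6.d. type-of-comparison code for a prior contact ES."""
    # "15" supersedes "13" and "14"; "14" supersedes "13".
    if interaction:
        return 15
    if treatment_within_levels:
        return 14
    if within_treatment_group:
        return 13
    return None  # none of the three analyses was reported
```

Checking the most inclusive analysis first means the weaker codes are
reached only when the stronger ones cannot be coded.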

Complete a PRIOR CONTACT EFFECT SIZES sheet for prior contact ESs coded
in A.6.d., based on the types of prior contacts in I.12., and for any quality
(I.13.) ESs that could be computed. Enter the quality ESs last. Compute the
Ds on a PRIOR CONTACT EFFECT SIZES COMPUTATIONS sheet. Use the standard
deviation used for computing treatment Ds. In computing Ds for within
treatment group ESs ("13" in A.6.d.), treat "no contact", or "neutral" (in
the case of quality), as the control condition. That is, place the dependent
variable mean for that classification second in the D; compare the other
means with it. In the case of quantity of contact, compare means only against
"none", not with each other. In the case of quality, also compare dependent
variable means for positive versus negative groups.
For coding alternatives, see Table 2.
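
A minimal sketch of the within treatment group comparisons follows. The
level labels ("none", "neutral", "positive", "negative"), the function
name, and the choice to place the negative group second in the positive
versus negative D are assumptions made for illustration; the standard
deviation is whatever was used for the treatment Ds.

```python
from typing import Dict, Tuple


def within_treatment_ds(means: Dict[str, float],
                        sd: float,
                        quality: bool = False) -> Dict[Tuple[str, str], float]:
    """Compute Ds for prior contact levels within a treatment group.

    means   -- dependent variable mean for each prior contact level
    sd      -- standard deviation used for computing the treatment Ds
    quality -- True when the levels describe quality rather than quantity
    """
    control = "neutral" if quality else "none"
    ds = {}
    for level, mean in means.items():
        if level != control:
            # The control classification mean is placed second in the D.
            ds[(level, control)] = (mean - means[control]) / sd
    # For quality only, also compare positive versus negative groups
    # (assumption: negative is placed second).
    if quality and "positive" in means and "negative" in means:
        ds[("positive", "negative")] = (means["positive"] - means["negative"]) / sd
    return ds
```

For quantity of contact the sketch never compares two non-control levels
with each other, which matches the instruction to compare only against
"none".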

I.5. Prior contact assessed. If there is no mention of the extent of the
Ss' prior contact with disabled persons, code "0". If prior contact is
assessed, code "1". If there is no explicit assessment but prior contact is
implicit (e.g., the Ss are either experienced psychiatric nurses or nurses in
training) and prior contact is considered by the author(s) to be a relevant
variable (i.e., pertinent to treatment outcomes), code "2". If "0" is coded,
cross out items I.6. through I.7. and I.9. through I.15.

I.6. Use--selection/assignment. This category refers to the use made by the
researcher(s) of the prior contact information. If "0" was coded in I.5.,
code "0". If the report indicates that prior contact was assessed, but no
further mention of that information is made, code "1". If information about
prior contact was only used to describe the sample, with no further use made
of it, code "2". E.g., if the data are also used for assignment, or are only
reported incidentally in reporting analyses, do not code "2". If prior
contact differences between groups prior to treatment are checked as an
assurance of comparability, code "2" and indicate the result(s) on the PRIOR
CONTACT COMMENTS sheet. If coding of another category supersedes the coding
of "2" when such comparisons are made, note the comparisons and results on
the COMMENTS sheet. In some cases, researchers will use information about
prior contact to determine who will be in the sample. For example, the
researcher may exclude anyone who has had prior contact. If persons with
prior contact were excluded from the sample, code "3". If the researcher
Table 2

Prior contact coding alternatives.

Situation 1. Prior contact not assessed.
    Primary ESs. Code: On P.C. Coding Instrument, #2, 4, 5, & 8. Others,
        "0" (not applicable), except x's in 9.
    Non-P.C. secondary ESs. Code: On P.C. Coding Instrument, #2, 8, 9;
        rest x's.

Situation 2. Prior contact assessed, no prior contact ES(s).
    Primary ESs. Code: Entire P.C. Coding Instrument; x's in 9.
    Non-P.C. secondary ESs. Code: On P.C. Coding Instrument, #2, 8, 9;
        rest x's.

Situation 3. Prior contact assessed, prior contact ES(s).
    Primary ESs. Code: Entire P.C. Coding Instrument; x's in 9.
    Non-P.C. secondary ESs. Code: On P.C. Coding Instrument, #2, 8, 9;
        rest x's.
    P.C. secondary ESs. Code: Original coding instrument, A.6.a., b., c.,
        d.; B.1.a., b., c.; B.3.; D.3., D.4., D.5., D.6.; F.1., F.2.b.,
        F.3. (all). Copy others from previous coding. Entire P.C. Coding
        Instrument.

uses a category such as "no and minimal contact", code "3". If only persons
with prior contact were included, code "4". If the researcher(s) categorized
people by amount of contact and then selected subjects from those strata,
code "5". If the researcher(s) stratified by amount of contact and assigned
to groups from the strata, code "6". If prior contact is used as a
covariate, including being entered first in a multiple regression equation,
in analyzing posttest scores, code "8".

I.7. Use--outcome analysis, correlation. If the correlation of prior
contact with posttest scores on the dependent measure for a treatment group
is reported, or can be computed, code "2"; if with change scores, code "3".
If the correlation between prior contact and posttest scores on the dependent
measure is reported for treatment and control (or treatment B) groups, code
"4"; with change scores, code "5". If "4" or "5" can be coded, do not code
"2" or "3". If "5" can be coded, do not code "4".

I.9. Number of new secondary ESs. Enter the number of new prior contact
secondary ESs. Enter x's for primary ESs, and "0's" for secondary ESs if no
previous contact ESs are coded for the study.

I.10. Type of assessment. If "0" was coded in I.5., code "0". If the report
only mentions prior contact, but does not tell how it was assessed, code "1".
If the researcher used a demographic or biographical questionnaire, code "2";
if the information was obtained during an interview, code "3". If the
researcher assumed contact because of the setting in which the Ss had been
prior to the research, such as having been in a mainstreamed classroom or
school, working as a psychiatric aide in a hospital, or being a parent of a
disabled child, code "4". If the researcher takes information that students
have provided on a questionnaire or interview and recategorizes it or
reclassifies it to come up with contact categories, code "5". If "5" is
coded, then do not code "2", "3", or "4".

I.11. Type of prior contact setting. Unless "4=setting" was coded in I.6.,
code "0". Categories "1" and "2" are for students, and categories "3" and
"4" are for teachers. If the prior setting was as aides or nurses in a
mental hospital or institution for mentally retarded persons, code "5". If
the information provided is that the contact was in a family setting, for
example, as parents, code "6".

I.12. Definition of degree of contact. If "0" was coded in I.5., or if the
ES is for quality (I.11.), code "0". If the survey or questionnaire used to
assess prior contact is available, examination of the items on it is the
preferred way to determine the definition of contact. If "5" was coded in
I.6., code I.8. in terms of the definition of the classification levels. If
contact is assessed in terms of the general amount of contact, such as none,
some, a great deal, code "2". If contact is assessed in terms of the number
of disabled persons with whom the person has had contact, code "3". If
contact is assessed in terms of frequency of contact, that is, how many times
the person has had contact with disabled persons, code "4". If contact is
assessed in terms of length, such as number of hours, days, years, of
contact, code "5". If contact is coded in terms of intensity, such as "no"
contact and "close" contact, code "6". If contact is assessed in terms of a
dichotomy that cannot be coded in 2-6, such as "knows disabled persons or
not", code "7". Or, if contact is assessed in terms of the type of
relationship, such as friend or family member, code "8".
I.13. Quality of contact. Here the question is whether the intent was to
assess whether the Ss' prior contact was a positive or negative experience
for them. Code "0" if "0=No" was coded in I.5.

I.14. Direction of quality comparison. This category is applicable only to
within treatment prior contact ESs (coded "13" in A.6.d.) which involve an
assessment of the quality of prior contact. For all other ESs, code "0" (not
applicable). Code "1", "2", or "3", depending on the comparison in the
within treatment ES.

I.15. Type of disability. Here, code the disability or disabilities of the
persons with whom the prior contact of the Ss was assessed. The categories
are the same as for C.4.d., and conventions are also the same.
J. CONTACT CODING SHEET
The purpose is to code studies in which contact is the treatment to
determine if attributes of contact are related to outcomes.

J.5. Status. This set of categories is included to get at the variable of
relative status for the nondisabled and disabled persons involved in the
contact.

J.5.a. Age. The question here is whether there is sufficient age
difference to affect status as perceived by the nondisabled participants.
For example, for young children in particular, a grade level difference is
likely to be significant (e.g., 4th vs. 5th grade). If both disabled and
nondisabled Ss appear to be about the same age or age differences are
distributed evenly (e.g., all are campers or college students), code "2". In
mental hospitals, patients may be a variety of ages, so, for example, it
cannot be assumed that they are older than or the same age as college Ss. In
such cases, unless specific age indicators are provided, code "0". If the
ages are specified but there is a variety of age relationships (e.g., some
nondisabled college Ss work with children, some with adults), code "4".

J.5.b. Educational-vocational prestige. Here code the relative status
accorded by obvious educational achievement (as distinct from age) and/or
vocational differences, as perceived by the nondisabled Ss. For example, if
a disabled professor lectures to a group of students, code "3". If the
nondisabled person is a nurse working with disabled patients, code "1". If
nondisabled Ss are tutoring or otherwise in contact with disabled students,
code "1" only if there is an obvious educational distinction, such as college
students tutoring high school or younger students, or the disabled persons
being tutored or in other situations are obviously educationally hampered
(e.g., severely/profoundly retarded, TMR) so as to be at a lower level than
the nondisabled Ss. That would not be true, e.g., for kindergarten Ss and
TMR peers. If tutors are different in age, but close in grade level, code
"2". In settings such as a mental hospital, where patients may come from a
variety of backgrounds, code "0", unless specific indicators are provided.

J.5.c. Helping relationship. This category is intended to address whether
the nondisabled persons are providing help of a personal sort, in a
one-to-one relationship, to disabled persons. Excluded would be situations
where the relationship is of a general, group nature, not one-to-one, such as
cooperative learning groups (code "0", and pick up the nature of cooperation
in 7.c.). For the "helping professions" (medical, psychological counseling,
rehabilitation, special education teachers or other teachers working closely
with disabled students) code "1". For trainees in a professional field in,
e.g., a practicum, code "2". If professional persons, such as psychiatric
nurses, were in an inservice program, code "1". Coded "0" are, e.g.,
educational relationships, such as a disabled professor teaching a class or a
disabled person coming into an elementary school to tell stories to the
children. Nonprofessional ("3") includes helping relationships such as
volunteers (including companions) in mental hospitals, volunteers in
recreational programs who provide assistance to the disabled persons rather
than just participating in recreational activities with them, and tutors.

J.5.d. Overall. Based on your coding of J.5.a.-J.5.c. and any other
information in the report (for example, indications of low social status
based on sociogram data or on other elements in the situation such as
special treatment due to the disability), decide what the relative status of
the disabled and nondisabled persons was in the contact situation and code
accordingly. Code "0=can't tell" only as a last resort.

J.7. Type of contact. a. Voluntariness. Here the question is whether or
not the nondisabled Ss volunteered to have contact with disabled persons. For
example, if students are arbitrarily assigned to a contact group or are in a
group chosen to be a contact group, code "1". If there is only one group in
a pre-post design and students do not volunteer to have contact, code "1",
unless "2" or "4" is appropriate. If persons volunteered for a professional
role that they knew would include contact (such as deciding to be a
psychiatric nurse, an employee in an institution for mentally retarded
persons, or a special education teacher) or for a preprofessional role that
would involve contact, such as signing up for a college course that they knew
included a practicum with disabled persons, code "2". If the design included
Ss who voluntarily chose to have contact, for example, choosing to be
companions to mentally ill or mentally retarded persons when that is not a
part of a professional or preprofessional role, code "3". If within an
assigned group, Ss may or may not have had contact, depending on their own
volition (such as in a playground situation in which some nondisabled
children may choose to play with disabled children while others may choose
not to), code "4".

J.7.b. Intimacy. Contact can vary in how personal it is. For example,
observing mentally retarded persons in an institution or being lectured to by
a disabled person typically involves no interaction and should be coded "1".
Tutoring, playing together on the playground, studying together, or
participating in a discussion led by a disabled person should be assumed to
be "casual" contact (code "2"), unless the report indicates otherwise.
Having a disabled person into one's home or sharing a cabin at a camp would
be coded "3", unless there is evidence in the report to the contrary.
"Varied" contact ("4") is coded when all Ss likely had some contact, but of
varying types, such as in a typical mainstreamed classroom or a dormitory
situation in which some Ss have disabled roommates and others don't. In a
situation where there is potential for contact but no assurance that all Ss
had contact with disabled persons, much less interacted with them, as on an
integrated university campus, code "5".

J.7.c. Cooperation-competition. In some contact situations, typically those
that lack interaction, there is no opportunity for cooperation or
competition, for example, in observing mentally retarded persons in an
institution or in being lectured to by a disabled person. In such cases,
code "1". Some treatment situations in which there are opportunities for
interaction do not necessarily call for cooperation or competition between
the disabled and nondisabled persons, and either or both might have occurred.
Examples would include living together in a dorm, simply being together in a
classroom, free play on the playground (recess), or care for disabled persons
in a hospital or institution. In such cases, code "2". In some contact
studies, cooperation is necessary for treatment implementation, but it is not
the focus of the treatment. Examples include tutoring, classroom
discussions, being a companion, or volunteers helping with a recreation
program. Code such situations "3". If cooperation is an explicit part of
the treatment, as when the Ss are told to cooperate as part of the
intervention (e.g., groups are told that their grades depend on a shared
effort), code "4". If there is both implied and explicit (assigned)
cooperation, code "4". If competition between the nondisabled and disabled
persons is part of the natural setting (e.g., playing a sport to win or
competition for grades), code "5". If Ss are told to compete as part of the
intervention (e.g., points are assigned for performance, with a prize for the
one with the most points), code "6". If there is both implicit and explicit
(assigned) competition, code "6". If there is any combination of "3" and/or
"4" with "5" and/or "6", code "7". Code "can't tell" only as a last resort.
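
The J.7.c. conventions amount to a small decision rule, sketched below. The
function and argument names are hypothetical and serve only to illustrate
the precedence among codes "1" through "7"; the "can't tell" category is
deliberately left out because it depends on the coder's judgment of the
report rather than on these flags.

```python
def cooperation_competition_code(opportunity: bool,
                                 implied_cooperation: bool = False,
                                 explicit_cooperation: bool = False,
                                 natural_competition: bool = False,
                                 assigned_competition: bool = False) -> int:
    """Return the J.7.c. code implied by the features of the contact."""
    if not opportunity:
        return 1  # no opportunity for cooperation or competition
    cooperation = implied_cooperation or explicit_cooperation
    competition = natural_competition or assigned_competition
    if cooperation and competition:
        return 7  # any combination of "3"/"4" with "5"/"6"
    if explicit_cooperation:
        return 4  # explicit takes precedence over implied cooperation
    if implied_cooperation:
        return 3
    if assigned_competition:
        return 6  # assigned takes precedence over natural competition
    if natural_competition:
        return 5
    return 2  # interaction possible, neither called for
```

The combined case is tested before the single cases so that a study with,
say, tutoring plus competition for grades is coded "7" rather than "3".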

J.7.d. Reinforcement. If there was likely some shared intrinsic reward for
common performance of a task or participation in an activity (evidenced, for
example, by mention of satisfaction with a job done), code "2". If the
"contactees" share a reward given by someone else, such as a prize, code "3".
If only the nondisabled person is reinforced, e.g., by praise, for having
contact, code "4". Because external rewards are more easily identifiable,
but are likely to also involve intrinsic satisfaction from achievement, "3"
takes precedence over "2" in coding. Follow that convention in coding "5" or
"6".

J.7.e. Pleasantness. Here the point is whether the nondisabled persons
found the contact situation, not only the contact with disabled persons, to
be pleasant or not. For example, if college students indicated that a
practicum was "an excellent learning experience", that is evidence for coding
"2", unless they also indicate that what was excellent was learning to handle
one's negative feelings about a situation. Don't score "1" or "2" without
some indicator, such as responses on a questionnaire or during an interview.
Don't simply assume without evidence, e.g., that contact with a profoundly
mentally retarded person is unpleasant or that common participation in a
cooperative group is necessarily pleasant.

J.7.f. Modeling. The question here is whether others model positive
interactions with disabled persons for the nondisabled Ss as part of the
treatment or treatment setting, whether the modeling was in a natural setting
or staged. A "significant other", coded "3", would be, e.g., a teacher.
Score "0" ("can't tell") unless there are indicators that modeling took place
or that modeling was not a part of the treatment.

J.8. Institutional/authority/peer support. Code here whether the norms that
governed interactions, the persons in authority where the contact occurred,
or the statements and behaviors of peers likely promoted positive attitudes
toward disabled persons. Look for specific evidence that the nondisabled
persons were aware of the norms (such as reports of explicit statements to
them by teachers) or peers' negative attitudes (such as statements by the Ss
themselves). Or look for evidence that such norms were not likely, as in a
carefully controlled piece of research in which the Ss are kept blind as to
the purpose of attitude modification. If such evidence is not available,
code "0".

J.9. Characteristics of disabled persons. a. Disability. As type and
severity of disability may be a factor in contact effects, the purpose here
is to code disabilities as specifically as possible. Use the "can't tell",
"0", category only as a last resort. If several types of disabilities were
present and are specified, code "1" and enter the code numbers for the
disabilities.     See Table 2 disabilities.     If there are

 0 = can't tell
 1 = combination
 2 = multiple disabilities
 3 = MR, can't tell
 4 = MR, general
 5 = MR, mild/moderate (e.g., IQ 35 or 40 to 70); EMR
 6 = MR, severe/profound (e.g., IQ below 35 or 40); TMR
 7 = severe multiple impaired
 8 = physical, general
 9 = wheelchair
10 = amputee
11 = paraplegic
12 = facial disfigurement
13 = visually impaired
14 = blind
15 = hearing impaired
16 = deaf
17 = mentally ill
18 = emotionally disturbed
19 = learning disabled
20 = cerebral palsy
21 = speech impaired
22 = spina bifida

several types of unspecified disabilities, e.g., in a special education
classroom, code "2". If reference is to mentally retarded or developmentally
disabled persons, but you cannot discern the level of retardation, code "3".
If there is a range of mental retardation, e.g., moderately to severely
retarded, code "4". If reference is made only to "physical disabilities",
code "8". If the disabled person or persons are in a wheelchair, code "9",
unless one of the other categories, e.g., "10=amputee" or "20=cerebral
palsy", can be coded to give a more specific indication of disability. Note
that there is not an "other" category. If you encounter a disability not
included in the list on page 3, add it to the list after consultation.

J.9.b. Negative stereotype. A concern in the literature is whether the
disabled persons with whom there is contact look or behave such as to
reinforce negative stereotypes about disabled persons. Don't infer simply
from the disability, e.g., blindness, that negative stereotyping was present,
but look for clues in the descriptions of Ss' reactions, the disabled persons
(e.g., paraplegics playing basketball or running races in wheelchairs are
nonstereotypic and should be coded "2", unless there is specific information
to the contrary), the treatment (e.g., mannerisms to be displayed by the
disabled persons), and the situation (e.g., those in an institution for the
mentally retarded are likely to be severely/profoundly retarded and
reflective of stereotypes, so code "1" for visits to such institutions. The
picture is not so clear for mental institutions. Some persons are likely to
be there because their behavior is stereotypic; others may appear very
normal. Unless there is evidence of severe psychiatric cases, code "0".).
If disabled persons were selected purposely to avoid negative stereotypes,
code "2".
J.9.c. Competence. The question here is whether the person's disability was
likely perceived by the nondisabled Ss as affecting competence in regard to
either (1) a specific, identifiable role which is part of the treatment
(e.g., a physically disabled lecturer or a paraplegic athlete) or (2) general
functioning. Code based on (1), if present; if not, code based on (2). If
the disabled person(s) clearly lacked competence, code "1". For example, in
a (1) situation, mentally retarded persons given an experimental role
requiring verbal abilities they lacked or students being tutored who lacked
the skills for the task being learned would be coded "1"; in a (2) situation,
mentally retarded students in self-contained classrooms who interacted with
nondisabled Ss, mentally retarded persons observed in institutions, or
patients in psychiatric wards, or people in mental institutions would be
coded "1". However, if in a (1) situation there was clear acknowledgement
and acceptance by the disabled and nondisabled participants of any disability
limitations in performing a task, code "2". If the disabled person is
likely perceived as competent, either because the disability was irrelevant
to a task at which the disabled person is competent (such as a lecturer in a
wheelchair) or because the disabled person performed well a task for which
the disability would ordinarily be considered an impairment (e.g., a
wheelchair athlete), code "3". In some situations, it may be difficult to
discern whether the disability was relevant to the nondisabled persons'
perceptions of the competence of the disabled persons. Examples would be a
physically disabled elementary student in a classroom setting, or mentally
retarded persons in a recreation program in which they may perform some
activities well but not others. In such cases, code "0", "can't tell".
J.10. Characteristics of nondisabled. a. Personality related. If
potentially relevant personality traits, such as I.Q. or authoritarianism,
were not assessed, code "0". If a personality trait was assessed, but its
relation to attitude outcomes was not tested, code "1". If the relation was
tested (e.g., in a factorial ANOVA or through a correlation with posttest
scores on the dependent measure(s) for a treatment group) and the author(s)
indicated there was none (or, in the absence of that, the relation was not
statistically significant at the .05 level), code "2". If there were
different results on multiple personality measures or different results
across groups, code "3". If there was a relationship, judged as above, code

J.10.b. Prior attitudes related. Code as with 10.a., now attending to
pre-post dependent measure(s).
META-ANALYSIS: MODIFYING ATTITUDES TOWARD DISABLED PERSONS

COMPUTATION  OF  EFFECT  SIZES 
Two  types  of  effect  sizes  (ESs)  will  be  used  in  this  meta-analysis:  (1) 
the  standardized  mean  difference,  the  result  of  dividing  the  difference 
between  two  means  by  an  appropriate  standard  deviation;  and,  (2)  correlation 
coefficients  which  express  the  degree  of  relationship  between  the  independent 
and  dependent  variables. 

The Standardized Mean Difference (D)
At  least  two  types  of  standardized  mean  difference  have  been  presented 
in  the  literature.  One  is  Glass's  Delta,  in  which  the  difference  between  a 
treatment  and  a  control  mean  is  divided  by  the  control  group  standard 
deviation;  the  other  is  Cohen's  d,  which  is  the  difference  between  a 
treatment  and  control  mean  divided  by  the  pooled  standard  deviation  for  the 
treatment and control group. We use the symbol, D, on the coding instrument
for the standardized mean difference to be used in this study.
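
As a rough illustration of the two indices just described, the sketch below computes Glass's Delta and Cohen's d from summary statistics. The function and argument names are hypothetical and are not part of the coding instrument; the pooling inside `cohens_d` assumes the usual (n - 1) weighting of the two group variances.

```python
import math


def glass_delta(mean_treatment, mean_control, sd_control):
    """Glass's Delta: mean difference divided by the control group sd."""
    return (mean_treatment - mean_control) / sd_control


def cohens_d(mean_treatment, mean_control, sd_treatment, sd_control,
             n_treatment, n_control):
    """Cohen's d: mean difference divided by the pooled sd of both groups.

    Assumption: the two variances are pooled with (n - 1) weights.
    """
    pooled_variance = (
        (n_treatment - 1) * sd_treatment ** 2
        + (n_control - 1) * sd_control ** 2
    ) / (n_treatment + n_control - 2)
    return (mean_treatment - mean_control) / math.sqrt(pooled_variance)
```

When the treatment inflates the treatment group's spread, the pooled sd grows and Cohen's d falls below Glass's Delta for the same mean difference, which is one reason to prefer a standard deviation that excludes treatment effects.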

The rationale behind the standardized mean difference as an effect
size is to compare the mean of subjects in one group (e.g., the treatment
group) to the mean of the subjects in another group (e.g., the control group)
relative to the dispersion of subjects who have not received any treatment.
The control group standard deviation is one legitimate basis for estimating
the dispersion in a non-treated population. On the other hand, if the
pretest standard deviations for the treatment and control groups are also
available, use of this additional information will yield a more stable
estimate of the standard deviation in the untreated population. The guiding
rule is to obtain the most stable estimate possible, excluding treatment
effects if possible. Once the estimate of the standard deviation is obtained
for a dependent measure, use it in computing all D's for that measure.

In  this  meta-analysis,  the  first  choice  for  the  standard  deviation  in 
computing  a  standardized  mean  difference  will  be  the  posttest  standard 
deviation  for  the  control  or  placebo  group(s)  pooled  with  the  pretest 
standard  deviations  for  the  treatment  and  control  (or  placebo)  groups.  If 
pretest standard deviations are not available, then the control group
standard deviation (sd) on the post assessment will be used, if available.
In a COVAR analysis (or when residual gain means are used) the only sd
available may be for the adjusted (or residual) scores. If r_xy is available,
use  it  to  estimate  the  sd  for  unadjusted  scores.  If  not,  use  r  =  .50  to 
estimate  the  sd  for  unadjusted  scores. 
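
The first-choice rule above can be sketched as follows. This is a minimal illustration, assuming that pooling means converting each sd to a variance and weighting by (n - 1); the function names and the `(sd, n)` pair convention are hypothetical.

```python
import math


def pooled_sd(sds, ns):
    """Pool sds by converting to variances and weighting each by (n - 1)."""
    if len(sds) != len(ns) or not sds:
        raise ValueError("sds and ns must be non-empty and of equal length")
    numerator = sum((n - 1) * sd ** 2 for sd, n in zip(sds, ns))
    denominator = sum(ns) - len(ns)
    return math.sqrt(numerator / denominator)


def first_choice_sd(post_control, pre_treatment=None, pre_control=None):
    """Each argument is an (sd, n) pair; the pretest pairs are optional.

    With both pretest pairs present, the posttest control sd is pooled with
    them; otherwise the posttest control sd is returned on its own.
    """
    pairs = [post_control]
    if pre_treatment is not None and pre_control is not None:
        pairs += [pre_treatment, pre_control]
    sds, ns = zip(*pairs)
    return pooled_sd(list(sds), list(ns))
```

For example, `first_choice_sd((10.0, 25), (12.0, 25), (8.0, 25))` pools three variances, while `first_choice_sd((10.0, 25))` falls back to the control group posttest sd.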

When the only information available is from an analysis of variance
table, a pooled standard deviation including the treatment group(s) will be
used. With a pre-post, one-group design, the pretest standard deviation
should be used. And, with a Treatment A vs. B design, a pooled standard
deviation for the pretest is preferred; if that cannot be computed, then a
pooled Treatment A and B posttest standard deviation will be used. (See
Table 4 for a summary of sd choices and formulae.)

Whatever standard deviation is chosen, based on the availability of
information, the same standard deviation is to be used for computing all
standardized mean differences for the particular dependent variable in the
study.

The question of interest is whether the status of the groups on the
posttest indicates any evidence of a treatment effect. However, a frequent
contaminating factor will be the existence of pre-treatment differences
Table 4. Choice of standard deviation for computing D

ORDER OF PREFERENCE                          RELEVANT FORMULAE

For treatment vs. control or placebo:

1. Posttest sds for placebo and/or           1. #2
   control group(s) pooled with
   pre-treatment sds for all groups.

2. Posttest sd(s) for placebo                2. #2
   and/or control group(s).

3. Sd estimated from control group(s)        3. #3, 4
   sd(s) for gain, adjusted, or
   residual scores.

4. Pooled sd from an ANOVA including         4. #5, 6, 8
   the treatment group(s).

5. Estimated pooled sd from a COVAR          5.
   including the treatment group(s).

For pre-post, one-group design: pretest sd.

For Treatment A vs. B:

1. If part of a design which includes        1. See 1-5 above.
   a control and/or placebo group(s),
   the same as for the treatment vs.
   control or treatment D (1-5 above).

2. If a Treatment A vs. B only design,
   use:

   a. Pooled pretest sd.                     2a. #2

   b. Pooled posttest sd.                    2b. #2

   c. Estimated sd, if adjusted              2c. See 3 and 5 above.
      or residual scores used,
      as in 3 and 5 above.
between the groups, even with random assignment. For that reason, when
possible, the means used in computing the standardized mean difference should
be adjusted for pre-treatment differences. The first preference, if the
necessary information is available in the research report, is to compute and
use raw gain means (that is, the posttest mean minus the pretest mean for
each group). If the means of change scores are reported, they equal the raw
gain means. The second preference will be adjusted means from an analysis of
covariance. The third preference will be residual gain scores. The final
preference will be the simple unadjusted posttest mean scores. (With each
type of mean difference, use the standard deviation as defined above, or an
estimate of it. Do not use the reduced standard deviation for the adjusted
scores.)
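
A minimal sketch of the first preference (raw gain means) is given below. The function name and arguments are hypothetical; `sd` is assumed to be the unadjusted standard deviation chosen by the rules above, not the standard deviation of the gain scores.

```python
def raw_gain_d(pre_treatment, post_treatment, pre_control, post_control, sd):
    """Standardized mean difference computed from raw gain means.

    Each gain is the posttest mean minus the pretest mean for that group;
    the difference between the two gains is divided by the unadjusted sd.
    """
    gain_treatment = post_treatment - pre_treatment
    gain_control = post_control - pre_control
    return (gain_treatment - gain_control) / sd
```

With pretest means of 50 and 51 and posttest means of 58 and 53, the gain-based value is (8 - 2) / 10 = 0.6, whereas the unadjusted posttest difference would give (58 - 53) / 10 = 0.5; the gap reflects the pre-treatment difference between the groups.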

Whenever possible, the standardized mean difference (D) should be
computed directly from information provided in the report. When that is not
possible because of a lack of information, D can be estimated from statistics
which are frequently reported, such as t- or F-ratios, and failing that,
from precisely reported significance levels.

To  summarize,  standard  deviations  are  preferred  in  the  following  order: 
(see  Table  4  for  more  detail) 

1.  Pooled  post-control  and  pre-treatment 
and  control  standard  deviations. 

2.  The  post-control  standard  deviation. 

3.  A  pooled  standard  deviation  which  includes 
the  treatment  group. 

The  order  of  preference  in  regard  to  means  to  be  used  is: 

1. Raw gain means, or mean change scores.

2.  Covariance  adjusted  posttest  means. 

3.  Residual  gain  means. 

4.  Unadjusted  posttest  means. 


The preference is always to compute the standardized mean difference (D)
using means and one of the three standard deviations; if such information is
not available in the report, then estimate the standardized mean difference
using statistics such as t- or F-ratios or chi-square if they are reported;
or, if such statistics are not reported but a precise significance level is,
use it.
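
The two orders of preference summarized above lend themselves to a simple lookup. The sketch below is only illustrative: the label strings and the helper name are hypothetical and do not appear in the coding instrument.

```python
# Orders of preference as summarized in the text, highest preference first.
SD_PREFERENCE = (
    "pooled_post_control_and_pretest",
    "post_control",
    "pooled_including_treatment",
)

MEAN_PREFERENCE = (
    "raw_gain",
    "covariance_adjusted",
    "residual_gain",
    "unadjusted_posttest",
)


def first_available(preference, available):
    """Return the highest-ranked label present in `available`, or None."""
    for label in preference:
        if label in available:
            return label
    return None
```

A coder who finds only covariance adjusted and unadjusted posttest means in a report would call `first_available(MEAN_PREFERENCE, {"unadjusted_posttest", "covariance_adjusted"})` and obtain `"covariance_adjusted"`; a `None` result signals that D must be estimated from a test statistic or a significance level instead.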

Calculation  of  D 

$$ D = \frac{\bar{X}_E - \bar{X}_C}{sd} $$  (#1)

1.    Computation  using  means  and  sd. 
a.  Means 

1) If pretest and posttest means are available, subtract to obtain
mean gain; or, if change score means are available, use them.

2) If pre-post means are not available, use, in order of preference:
(a) covariance adjusted means; (b) residual means (i.e., scores
were predicted, usually based on pretest scores, and the obtained
minus the predicted score used as the dependent measure); or, (c)
unadjusted posttest means.

b.    Standard  deviations 

1) If posttest control group and pretest treatment and control
groups sd's are available (with multiple treatments, more than 3
sd's may be available), obtain a pooled sd, using the following
formula (note that sd's are changed to variances for this
computation):

$$ sd_{pooled} = \sqrt{\frac{\sum_{i=1}^{k} (n_i - 1) s_i^2}{\sum_{i=1}^{k} n_i - k}} $$  (#2)

where s_i^2 is the variance of group i, n_i is the number in Group
i, and k is the total number of groups. If you have multiple
sd's to pool and one is an extreme outlier, do not include it in
the pooled sd.

2) If sd's are not available as in 1) above, use the control group
posttest sd. You may have to estimate it in the following ways:

a) If gain scores are used and a standard deviation for gain
scores (sd_g) for the control group is provided along with
the pre-posttest correlation, then

$$ sd = \frac{sd_g}{\sqrt{2(1 - r_{xy})}} $$  (#3) (G, M, S, p. 118)*

If r_xy is not reported, use r_xy = .50.

b) With residual or covariance adjusted scores and a standard
deviation for control group residual scores (sd_res) and the
pre-posttest correlation available:

$$ sd = \frac{sd_{res}}{\sqrt{1 - r_{xy}^2}} $$  (#4) (G, M, S, p. 118)

If the r_xy is not available, use r_xy = .50.

3) If the necessary standard deviations are not available to get a
"treatment-free" standard deviation, use the following:

a) With a one-way ANOVA:

$$ sd = \sqrt{MS_w} $$  (#5)

b) With a one-way ANOVA, if only F (and not a source of
variance table) and group means are provided, compute:

$$ sd = \sqrt{\frac{MS_b}{F}}, \quad MS_b = \frac{\sum_j n_j (\bar{X}_j - \bar{X})^2}{k - 1} $$  (#6) (G, M, S, p. 128)

If \bar{X} (the grand mean) is not reported, compute it as a
weighted average of the group means:

$$ \bar{X} = \frac{\sum_j n_j \bar{X}_j}{N} $$  (#6A)

c) With a one-way COVAR, and one covariate:

(#7)

If there is more than 1 covariate, the d.f._w terms must be
decreased by 1 more for each additional covariate. If r_xy
is not available, use r_xy = .50.

d) With an n-way ANOVA (a factorial design) or COVAR, all
sources but that for the treatment must be pooled: so, for
a 2 x 2 design, with the treatment the A factor:

$$ sd = \sqrt{\frac{SS_B + SS_{AB} + SS_w}{df_B + df_{AB} + df_w}} $$  (#8) (G, M, S, p. 119)

*Sources of special formulae are included. G, M, S = Glass et al., 1981;
Hays = Hays, 1973; White = Karl White, unpublished materials.
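
The estimation steps above can be sketched in code. This is a hedged illustration rather than the project's own procedure: it assumes the standard relations sd_gain = sd * sqrt(2(1 - r)) and sd_residual = sd * sqrt(1 - r^2), and it recovers the within-groups mean square from a one-way F and the group means as MS_between / F. All function names are hypothetical.

```python
import math

DEFAULT_R = 0.50  # fallback pre-post correlation named in the text


def sd_from_gain_sd(sd_gain, r_xy=DEFAULT_R):
    """Estimate the unadjusted sd from the sd of gain scores."""
    return sd_gain / math.sqrt(2.0 * (1.0 - r_xy))


def sd_from_residual_sd(sd_residual, r_xy=DEFAULT_R):
    """Estimate the unadjusted sd from the sd of residual or adjusted scores."""
    return sd_residual / math.sqrt(1.0 - r_xy ** 2)


def sd_from_f_and_means(f_ratio, means, ns):
    """Estimate a pooled sd from a one-way F and the group means.

    The grand mean is taken as the n-weighted average of the group means,
    MS_between is computed from the group means, and MS_within = MS_between / F.
    """
    n_total = sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    ms_between = ss_between / (len(means) - 1)
    return math.sqrt(ms_between / f_ratio)
```

Note that with the default r = .50 the gain-score sd equals the estimated unadjusted sd, because sqrt(2(1 - .50)) = 1.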


2. Using an inferential statistic to estimate D when sd cannot be
obtained.

Sometimes when a mean difference is not statistically
significant, the report will not indicate which group had the
higher mean. In that case, even though D might be obtained, no
sign could be attached to the D; it would be meaningless for our
purposes, and should not be coded.

a. t-test for independent groups:

$$ D = t \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} $$  (#9) (G, M, S, p. 125)

b. t-test for correlated groups (matched pairs):

$$ D = t \sqrt{\frac{2(1 - r_{xy})}{n}} $$  (#10) (G, M, S, p. 125)

If r_xy is not available, use r_xy = .50.

c. t-test comparing gain scores:

$$ D = t \sqrt{2(1 - r_{xy}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} $$  (#11) (G, M, S, p. 127)

If r_xy is not available, use r_xy = .50.

d. F from a one-way ANOVA with 2 levels:
with n_1 = n_2 use formula #9, or

(#12) (White)

With n_1 ≠ n_2 take \sqrt{F} = t and use formula #9.

e. F from a one-way COVAR with 2 levels and 1 covariate:

(#13) (G, M, S, p. 127)

If r_xy is not available, use r_xy = .50.

If there is more than 1 covariate, d.f._w terms must be
adjusted by 1 more for each additional covariate.

f. n-way ANOVA with 2 levels of the treatment factor:

All sources of variation other than the treatment must
be collapsed, an adjusted MS_w computed (add SSs and add
d.f.s., and divide SSs by d.f.s. to get adjusted MS_w), and
an adjusted F computed, which is the treatment MS divided by the
adjusted MS_w. Then, if n_1 = n_2, use formula #9 or

(#14)

If n_1 ≠ n_2 take \sqrt{F} = t and use formula #9.
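
The t-based estimates referred to above can be sketched as follows, assuming the standard conversions D = t * sqrt(1/n1 + 1/n2) for independent groups, D = t * sqrt(2(1 - r)/n) for matched pairs, and D = t * sqrt(2(1 - r)(1/n1 + 1/n2)) for a t-test on gain scores. The function names are hypothetical, and the two-group F case simply takes sqrt(F) as t with a sign supplied by the coder.

```python
import math


def d_from_t_independent(t, n1, n2):
    """D from a t-test for independent groups."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


def d_from_t_correlated(t, n_pairs, r_xy=0.50):
    """D from a t-test for correlated groups (matched pairs)."""
    return t * math.sqrt(2.0 * (1.0 - r_xy) / n_pairs)


def d_from_t_gain(t, n1, n2, r_xy=0.50):
    """D from a t-test comparing gain scores of two groups."""
    return t * math.sqrt(2.0 * (1.0 - r_xy) * (1.0 / n1 + 1.0 / n2))


def d_from_f_two_groups(f_ratio, n1, n2, treatment_higher):
    """D from a two-level one-way F, taking sqrt(F) as t.

    F carries no sign, so the direction of the difference must come from
    the report; if it is unknown, the effect size should not be coded.
    """
    t = math.sqrt(f_ratio)
    if not treatment_higher:
        t = -t
    return d_from_t_independent(t, n1, n2)
```

The explicit `treatment_higher` argument mirrors the caution in the text: a D without a sign is not usable for the meta-analysis.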

3. Using level of statistical significance when an sd or test of
significance is not available and an exact p value and n's or d.f. are
reported:

Find the exact level of statistical significance for the specific d.f.
in the appropriate table and read off the corresponding statistic (e.g.,
t), then proceed as in 2. above.

4. Using nonparametric test statistics, other than chi-square, when 2
groups are compared:

Substitute for the nonparametric statistic (e.g., the U from a
Mann-Whitney test) the value of t with the equivalent level of significance,
and proceed as in 2. above.

5. When results for two groups are available in proportions or percentages,
with the statistical significance of the difference in proportions
tested using either Z or chi-square:

Probit transformations (finding the standard normal deviates of the
proportions and taking the difference between them as an estimate of D)
have commonly been used in this situation. An alternative which seems
to produce more realistic results (i.e., fewer high D's, out of line
with the D's from means and standard deviations and/or t's) is to
compute Phi and use it to estimate D.

In the case of proportions for two groups, the statistical significance
of the difference can be tested using Z or chi-squared, with identical
results, because with 1 d.f., the square root of chi-squared is a normal
deviate, so Z^2 = chi-squared.
First, compute Phi, using either of the following formulae:


Then, compute D, using the following formula:

Note: If the chi-squared has more than one d.f., neither
phi nor D can be computed.


6. When a point biserial r is available,
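
The two routes for proportions described in item 5 can be compared with a short sketch. It makes several assumptions that are not confirmed by the text: Phi is taken as Z / sqrt(N) or sqrt(chi-squared / N), and a correlation is converted to D with the common equal-group-size formula D = 2r / sqrt(1 - r^2). The function names are hypothetical.

```python
import math
from statistics import NormalDist


def probit_d(p_treatment, p_control):
    """Probit estimate: difference between the standard normal deviates
    of the two proportions (each must lie strictly between 0 and 1)."""
    z = NormalDist().inv_cdf
    return z(p_treatment) - z(p_control)


def phi_from_z(z, n_total):
    """Phi from the Z statistic for a difference in two proportions."""
    return z / math.sqrt(n_total)


def phi_from_chi_squared(chi_squared, n_total):
    """Phi from a 1 d.f. chi-squared; the sign must be supplied separately."""
    return math.sqrt(chi_squared / n_total)


def d_from_r(r):
    """Assumed conversion from a correlation (Phi or r_pb) to D."""
    return 2.0 * r / math.sqrt(1.0 - r ** 2)
```

Because Z^2 equals chi-squared with 1 d.f., `phi_from_z(2.0, 100)` and `phi_from_chi_squared(4.0, 100)` agree; the chi-squared route loses the direction of the difference, so the sign has to be taken from the reported proportions.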


Correlation  Coefficients 

Effect sizes can be expressed in terms of the degree of association
between group membership (e.g., treatment and control group membership) and
scores on the dependent variable. For all instances except interaction
effects, ESs for this meta-analysis will be recorded both as D's and
correlation coefficients.

For the straightforward design which involves the comparison of two
groups on a posttest, the point biserial correlation (r_pb) is the coefficient
of choice. The r_pb is also appropriate for pre-post or post-delay
comparisons. When the findings are in the form of percentages (e.g., the %
of the treatment group with favorable attitudes, as compared to the % of the
control group), a Phi coefficient is appropriate.

For interaction effects, a D will not be available, but a correlation
coefficient can be computed: Eta^2. Cramer's V, an extension of Phi, may be
appropriate for some tables that are larger than 2-by-2. (V has the
advantage over the contingency coefficient, C, in that it can attain a value
of 1.00 and can be compared across contingency tables with varying df's.)

Calculation of correlation coefficients

If r_pb for the particular ES is included in the report, use it.
If r_pb is not reported, it can be obtained in the following ways:
a) With a D as an effect size for a comparison of means;

b) With an ANOVA with only 2 levels of the treatment, when a source
of variance table is provided, divide the sum of squares for the
treatment by the total sum of squares to obtain an Eta^2 (with
2 groups, Eta^2 = r_pb^2), and take the square root:

$$ r_{pb} = \sqrt{\eta^2} = \sqrt{\frac{SS_{treatment}}{SS_{total}}} $$


2. Phi

With two proportions, a Phi can be computed using either Z or
chi-squared:

$$ \phi = \frac{Z}{\sqrt{N}} $$  (#18)

$$ \phi = \sqrt{\frac{\chi^2}{N}} $$  (#19)

3. Eta^2 for an interaction effect

As noted in 1. above, Eta^2 is a ratio of the sum of squares for a
source of variance to the total sum of squares. So,

$$ \eta^2 = \frac{SS_{interaction}}{SS_{total}} $$  (#20)




4. Cramer's V can be computed using Phi or chi-squared:

$$ V = \sqrt{\frac{\phi^2}{L - 1}} = \sqrt{\frac{\chi^2}{N(L - 1)}} $$  (Hays, p. 745)

with L the smaller of R, the number of rows, or C, the number of
columns.
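
The correlation-type effect sizes discussed in this section can be sketched as below. The conversion from D to r_pb assumes equal group sizes (r = D / sqrt(D^2 + 4)), which is a common textbook form and not necessarily the one used in the original instrument; Cramer's V is computed from chi-squared as sqrt(chi-squared / (N (L - 1))). The function names are hypothetical.

```python
import math


def r_pb_from_d(d):
    """Point biserial r from D, assuming two groups of equal size."""
    return d / math.sqrt(d ** 2 + 4.0)


def eta_squared(ss_effect, ss_total):
    """Eta squared: sum of squares for a source divided by the total."""
    return ss_effect / ss_total


def cramers_v(chi_squared, n_total, n_rows, n_cols):
    """Cramer's V, with L taken as the smaller of rows and columns."""
    smaller = min(n_rows, n_cols)
    return math.sqrt(chi_squared / (n_total * (smaller - 1)))
```

For a 2-by-2 table L - 1 equals 1, so `cramers_v` reduces to Phi, which is the sense in which V extends Phi to larger tables.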

Calculation  of  Variance  Ratios 

The point of calculating a variance ratio (s_E^2 / s_C^2) is to determine
whether the treatment has had an effect on variability. The experimental
group posttest variance (remember always to square the standard deviation) is
divided by the control group posttest variance. In Treatment A vs. B ESs,
divide the Treatment A posttest variance by that for Treatment B. With a
pre-post, single-group design ES, divide the posttest variance by the pretest
variance. Variance ratios should be computed only if the standard deviation
or variance is available for each group. Estimates should not be used.
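
A minimal sketch of the variance ratio follows; the function name is hypothetical. It takes reported standard deviations, squares them, and returns no value when either sd is missing, in keeping with the rule that estimates should not be used.

```python
def variance_ratio(sd_numerator, sd_denominator):
    """Ratio of two variances computed from reported standard deviations.

    The numerator is the experimental (or Treatment A, or posttest) group and
    the denominator is the control (or Treatment B, or pretest) group.
    Returns None when either sd was not reported.
    """
    if sd_numerator is None or sd_denominator is None:
        return None
    return sd_numerator ** 2 / sd_denominator ** 2
```

For instance, an experimental posttest sd of 12 against a control posttest sd of 10 gives a ratio of 144 / 100 = 1.44, suggesting somewhat greater variability after treatment.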
META-ANALYSIS: MODIFYING ATTITUDES TOWARD DISABLED PERSONS

CONVENTIONS  ADDENDA 

1. Omnibus F. If an omnibus F (for 3 or more means) is computed and
reported as statistically nonsignificant, then post hoc comparisons of all
pairs of means would be statistically nonsignificant. Consequently, "1"
("not significant at .05 level") would be coded for "F. Results. 1.
Statistical Significance", for each pair of means. (Note that the overall F
would not provide the information necessary to compute ESs.) However, if an
omnibus F for 3 or more means is statistically significant, nothing can be
concluded about the significance of differences between specific pairs of
means, and "0" would be coded. If means and standard deviations, or means
and the F, are available, ESs can be computed. Otherwise, unless more
information can be obtained from the author(s), ESs cannot be computed.
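
The omnibus F convention amounts to assigning one code to every pair of means. The sketch below is illustrative only; the function name and the use of string codes are assumptions, with "1" standing for "not significant at .05 level" and "0" for the case where nothing can be concluded about a specific pair.

```python
from itertools import combinations


def pairwise_significance_codes(group_labels, omnibus_significant):
    """Return the significance code for each pair of group means.

    A nonsignificant omnibus F yields "1" for every pair; a significant
    omnibus F yields "0" for every pair, because the overall test says
    nothing about which specific pairs differ.
    """
    code = "0" if omnibus_significant else "1"
    return {pair: code for pair in combinations(group_labels, 2)}
```

With three groups this produces three pairs, for example `("A", "B")`, `("A", "C")`, and `("B", "C")`, all carrying the same code.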

2.  Dependent  measures. 

Interactions. The number or nature of interactions of nonhandicapped Ss
with handicapped persons may be a dependent measure in a study of attitude
change, on the premise that changed attitudes will be manifested in changed
(e.g., more frequent, more positive) behavior by nonhandicapped Ss. However,
studies in which the aim is simply to change behavior per se, rather than to
assess behavior as a possible outcome of an attitude change intervention or
to change behavior as a means of changing attitudes, as separately assessed,
are not relevant for the meta-analysis.

Social status. As with interactions, the assessment of the social status
of handicapped persons may yield a dependent measure in a study of attitude
change, on the premise that changes in attitudes will be reflected in changed
estimates of social status. However, studies in which the aim is to change
social status per se, rather than to assess social status as an outcome of an
effort to change attitudes or to change social status as a means of affecting
attitudes, as separately assessed, are not relevant for the meta-analysis.

Other measures. In general, a measure (such as the attractiveness of
disabled persons or volunteering to work with disabled persons) that is not a
straightforward assessment of attitudes is not to be included in the
meta-analysis as a dependent measure unless the research report indicates the
measure was considered by the investigators to be an attitude assessment.
Then, of course, its validity must be judged (D.10.e). This is consistent
with the treatment of the Rucker-Gable Educational Programming Scale (RGEPS)
specified on page I>3 of the CONVENTIONS FOR USE OF CODING INSTRUMENT.

4. Static group program evaluations. Some of the reports to be coded will
involve, rather than planned treatments or interventions, post hoc
evaluations of instructional programs, e.g., courses of study for prospective
special education teachers or rehabilitation counselors. These studies will
typically involve a static group design with assessment at the end of the
program. Program students and students from other programs who are at
comparable points in their education will typically be the Ss. In the Coding
Instrument, Category C. Treatment/Intervention, 1. Basis, calls for a
judgment as to the basis for the treatment or intervention. In this
particular kind of research, as with many mainstreaming studies, there is not
a planned treatment or intervention but an effort to determine whether an
ongoing program is having some effect. For that reason, and to be able to
sort out this type of program evaluation project during the analysis, code
"6=other" for Category C.1. and then specify the type of program being
evaluated (e.g., graduate program for counselors). The same line of reasoning
will apply to mainstreaming studies which have been implemented as the result
of public policy and then studied on a post hoc basis. If, however, there is
a clear rationale laid out for the program and/or expectations for program
outcomes, score the study appropriately. For example, if there is a
discussion of a theoretical basis for the program, or if the type of program
or the expectations of outcomes are based on the citation of prior research,
code as "3" or "4", as appropriate.

5. Interaction ES's. When an interaction is coded (A.6.d.), only an Eta^2
will be available as an ES. Moreover, experimental and control group
information is not codable if there are more than two groups. In that case,
for categories for which experimental and control group spaces are to be
filled in, enter x's or the code for "not applicable", if there is one. For
the other categories, enter codes as usual.

6. Incidental dependent measures. Include in ESs only measures defined or
clearly intended by the researcher(s) as dependent measures which assess
attitudes. That is, if results are reported on measures which were not
considered to assess attitudes (e.g., an observational measure which is not
discussed as an assessment of the behavioral aspect of attitudes is included
as well as an attitude scale) or which were not the aim of the treatment
(e.g., the purpose was to change authoritarian attitudes toward the mentally
ill, as assessed by the OMI Authoritarianism Factor, but information is also
given incidentally for other OMI factors, including the Unsophisticated
Benevolence factor), do not include them. As noted in the STATEMENT OF
GENERAL PURPOSE AND POPULATIONS, measures such as sociometric scales,
friendship choices, or positive interactions are relevant only if the
researchers consider them to be assessments of attitudes.

7. Primary ES for Solomon 4-group designs. With a Solomon 4-group design,
compute a D_a for the pretest-posttest part of the design (using raw gains)
and a D_b for the posttest-only part of the design and pool the two D's
(weighting by n if the design is not balanced).

With equal n's:

$$ D = \frac{D_a + D_b}{2} $$

With unequal n's:

$$ D = \frac{n_a D_a + n_b D_b}{n_a + n_b} $$

If pretest means are not reported for the pretest-posttest part of the
design, pool the means and standard deviations for the two design parts and
then compute a D.
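
The pooling rule for Solomon 4-group designs can be sketched as a simple or n-weighted average. The function and argument names are hypothetical, and the weighted form assumes that each part's D is weighted by the number of subjects in that part of the design.

```python
def pool_solomon_d(d_pre_post, d_post_only, n_pre_post=None, n_post_only=None):
    """Pool the D's from the two parts of a Solomon 4-group design.

    With a balanced design (or when the n's are not supplied) the two D's
    are averaged; otherwise each D is weighted by its n.
    """
    if n_pre_post is None or n_post_only is None or n_pre_post == n_post_only:
        return (d_pre_post + d_post_only) / 2.0
    weighted_sum = n_pre_post * d_pre_post + n_post_only * d_post_only
    return weighted_sum / (n_pre_post + n_post_only)
```

For example, D's of 0.4 and 0.6 pool to 0.5 in a balanced design, but to 0.45 when the pretest-posttest part has 30 subjects and the posttest-only part has 10.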
8. ESs with the OMI. Unless the authors indicate they scored their data
otherwise, high scores on both the OMI Authoritarianism (A) and
Unsophisticated Benevolence (B) Scales are "bad" and low scores "good".

9. Intra-coder Agreement Checks. After you have coded 30 studies in
sequence, excluding reliability checks, go back to the beginning of that
sequence and select the first article with four or fewer effect sizes to
recode as an intra-coder reliability check.
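
The selection rule for the intra-coder check can be sketched as a scan over the most recent batch of coded studies. The function name, the `(study_id, n_effect_sizes)` pair convention, and the example IDs are hypothetical.

```python
def pick_recode_study(batch, max_effect_sizes=4):
    """Return the first study in coding order with few enough effect sizes.

    `batch` is a list of (study_id, n_effect_sizes) pairs for the 30 studies
    just coded, in the order they were coded. Returns None if no study in
    the batch qualifies.
    """
    for study_id, n_effect_sizes in batch:
        if n_effect_sizes <= max_effect_sizes:
            return study_id
    return None
```

Given a batch beginning `[("S1", 7), ("S2", 3), ("S3", 1)]`, the study chosen for recoding is `"S2"`, the earliest one with four or fewer effect sizes.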
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10.  ATDP — Forms  A  and  B.  If  Forms  A  and  B  of  the  ATDP  are  used  for  the 
pretest  and  posttest,  and  pretest  (Form  A)  scores  are  used  as  the  covariate 
in  a  COVAR,  use  the  adjusted  posttest  means  (Form  B)  to  compute  ESs. 
However,  if  adjusted  means  are  not  reported,  or  if  an  ANOVA  or  t-test  is  used 
on  the  posttest  means,  compute  raw  gains  using  the  two  forms  (difference 
between  Form  A,  pretest,  mean  and  Form  B,  posttest,  mean,  or  vice  versa)  to 
use  in  computing  ESs. 

11. Studies with ES information missing. Use the ES Information Missing
CODING INSTRUMENT to code studies that lack the necessary information to
compute ESs only if information on the statistical significance of results is
available. If the study is relevant, but ESs cannot be computed and
information is not available to choose between "1" and "2" in category F.1 on
page 14 of the CODING INSTRUMENT, the study should be placed on the "Lack of
Information" discard list and in the "Lack of Information" discard pile. For
example, if ESs could not be computed and pretest-post gains for a control
and an experimental group were tested separately for statistical
significance, but no comparison was made of experimental-control posttest
means, the study should not be coded.

If a report has one or more ESs which can be computed and one or more
which cannot, code the two sets of ESs separately using the appropriate
CODING INSTRUMENT (i.e., the ES Information Missing instrument for the second
set). Code as if you had two separate studies. For example, for the first
set, the N of ESs (A.6.a. of the CODING INSTRUMENT) will be the number which
can be computed; for the second set, the number for which information is
missing. Where Report ID# is recorded on coding sheets, enter the ID#
without an asterisk for the first set and with an asterisk for the second
set. Also, change the Alphabetic and Numerical lists so that the ID# is
followed by an asterisk within parentheses, to indicate that the study has
been coded using both instruments.
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APPENDIX  D 
LISTS  OF  DISCARDED  REPORTS 

(1)  Irrelevant  Reports 

(2)  Reports  Lacking  Information 
Discarded Reports: Irrelevant

Agness, Phyllis J. (1980). Unpublished doctoral dissertation, Ball State
University.

Alper, Sandra, & Retish, Paul M. (1972). Training School Bulletin, 69, 70-77.

Aloia, Gregory F., Beaver, Robert J., & Pettus, William F. (1978).
American Journal of Mental Deficiency, 82(6), 573-579.

Anderson, Shirley Seaton. (1982). Unpublished doctoral dissertation, George
Peabody College for Teachers.

Anthony, William A., & Cannon, John. (1969). Rehabilitation Counseling
Bulletin, 12, 239-40.

Appolone, Carol; Romeis, James; Gibson, Patricia; McLean, William; & Howard,
George. (1979). Epilepsia, 127-132.

Armstrong, Barbara, Johnson, David W., & Balow, Bruce. (1981). Contemporary
Educational Psychology, 6, 102-109.

Armstrong-Hugg, Robin Lee. (1982). Unpublished doctoral dissertation, The
American University.

Asher, Nancy Weinberg. (1973). Rehabilitation Psychology, 20(4), 156-164.

Babow, I., & Johnson, A. (1969). American Journal of Learning Disabilities, 74,
116-124.

Baker, Sheldon R. (1964). Nursing Research, 13, 345-347.

Ballard, Maurine; Gottlieb, Jay; Corman, Louise; & Kaufman, Martin J.
(1977). Journal of Educational Psychology, 69(5), 605-611.
Reports are identified by the last name or names of authors and
by year. Full citations can be found in Appendix E.

Arbitrary  number  for  recordkeeping  purposes. 
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The  attitude  modification  technique  used  as  the  treatment: 


1=information
2=direct  contact 
3=vicarious  experience 
4=positive  reinforcement 
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10=systematic  desensitization 
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The  effect  size,   defined  as  D  =    ,   or  an  estimate 
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115 
191 
55 
35 
35 
127 
127 
9 

161 
161 
161 
161 

213 
213 
120 
43 

la 

150 
5 
5 

269 
269 
272 
13 
15 
13 
15 
193 
226 
226 
226 
226 
226 
226 
22ti 
226 
326 
226 
226 
226 
226 
226 
202 
Q 
134 


1 
1 
7 
2 
1 
2 
1 
2 
1 
2 
3 
2 
4 
1 
2 
3 
4 
5 
1 
2 
7 
1 

3 
1 
1 
2 
1 
2 
1 
1 
2 
3 
4 
2 
1 
2 
3 
4 
3 
6 
7 

a 

9 
10 

1 1 

12 
13 
14 

2 

1 

1 


9 
1 
1 

15 
1 
1 
1 
2 
4 
4 
4 
4 
4 
3 
19 
19 
19 
19 
2C 
20 

5 
1 
1 
1 
1 
3 
3 
3 
7 
7 
7 
7 


19 
43 
43 
73 
40 
40 
55 
163 
250 
300 
275 
34 
47 
400 
147 
1J9 
147 
159 
63 
63 
90 
z2 
40 
43 
26 
16 
971 
971 
367 
13 
12 
13 
12 
32 
1 16 
1 16 
1 16 
116 
116 
1 16 
1 16 
116 
1 16 
1 16 
1 16 
116 
•  16 
1 16 
234 
41 
57 


9 
1 
1 
1 
1 
1 

15 
14 
1 
1 
1 

2 
2 
14 
1 
1 
1 
I 
2 
2 
1 
4 
14 
14 
9 

9 
13 
15 
15 
6 
6 
6 


13 


EDLEVEXP

UNIVMAJE

OCCUPEXP

COMINSTR

ASSESTYP

TRTECHE

TOTHRSE

ES

0 

0 

3 

0 

3 

2 

120  0 

75 

2 

0 

0 

0 

3 

2 

3  7 

17 

2 

0 

0 

o 

2 

3  7 

-  01 

2 

0 

0 

0 

3 

?. 

7 

93 

3 

0 

0 

0 

3 

2 

7  5 

25 

3 

0 

0 

0 

3 

2 

7  5 

-  10 

3 

0 

0 

1 

3 

2 

45  0 

52 

3 

0 

0 

1 

3 

2 

26 

3 

0 

0 

0 

4 

2 

1.  5 

3  67 

3 

0 

0 

0 

4 

2 

1.  3 

04 

3 

0 

0 

0 

4 

2 

1.  5 

4  40 

3 

0 

0 

0 

3 

2 

23  0 

17 

3 

0 

0 

0 

17 

2 

23  0 

'  1.  05 

3 

0 

0 

0 

6 

2 

13 

3 

0 

0 

0 

3 

2 

900  0 

32 

3 

0 

0 

0 

3 

2 

900  0 

30 

1 

0 

0 

0 

3 

2 

900  0 

49 

3 

0 

0 

0 

3 

2 

900  0 

-  04 

3 

0 

0 

0 

9 

2 

10  3 

95 

3 

0 

0 

0 

6 

2 

10  3 

34 

5 

0 

0 

1 

3 

2 

1  0 

02 

i> 

0 

0 

0 

9 

2 

360  0 

1  03 

7 

0 

0 

0 

4 

2 

42 

7 

0 

0 

0 

6 

2 

-  47 

7 

0 

3 

I 

3 

2 

53 

7 

0 

I 

3 

2 

49 

7 

0 

0 

Q 

2 

-  01 

7 

0 

0 

0 

9 

2 

05 

7 

1 

0 

0 

3 

2 

.  55 

7 

1? 

0 

0 

9 

2 

13 

7 

13 

0 

0 

9 

2 

10 

7 

13 

0 

0 

9 

2 

43 

7 

13 

0 

0 

9 

2 

49 

7 

16 

0 

0 

3 

2 

1  2 

-  12 

a 

16 

0 

0 

9 

2 

4  0 

33 

a 

16 

0 

0 

9 

2 

4  0 

52 

9 

16 

0 

0 

9 

2 

4  0 

74 

a 

16 

0 

0 

9 

2 

4  0 

34 

9 

16 

0 

0 

9 

2 

4  0 

10 

s 

16 

0 

0 

9 

2 

4  0 

-  09 

9 

16 

0 

0 

9 

2 

4  0 

23 

9 

1^ 

0 

0 

6 

2 

4  J 

14 

9 

16 

0 

0 

6 

2 

4  0 

12 

9 

16 

0 

0 

6 

2 

4  0 

44 

8 

16 

0 

0 

6 

2 

4  0 

73 

9 

16 

0 

0 

6 

2 

4  0 

-  16 

9 

16 

0 

0 

6 

2 

4  0 

-  17 

9 

16 

0 

0 

6 

2 

4  0 

2t 

9 

1 

0 

0 

3 

2 

~  09 

9 

1 

0 

1 

3 

2 

32  S 

04 

9 

16 

0 

1 

3 

2 

59 



AUTHOR

YEAR

ID#

ESID#

ATTOWARD

TOTN

SETTINGE

UNIVMAJE

Petrangelo 

1976 

134 

2 

155 

1 

Q 

16 

Rowe, & Stutts

in press

121 

3 

40 

2 

g 

16 

121 

40 

1 1 

3 

121 

2 

35 

1 1 

g 

16 

Urie,  &  Smith 

1970-71 

202 

I 

?54 

13 

g 

Hamilton,  &  Anderson 

1983 

16 

1 

120 

14 

g 

Rowe, & Stutts

in  press 

121 

4 

I 

60 

14 

g 

1  A 

Evans

1976 

36 

2 

2 

40 

12 

g 

« 

1 

1  T 
1  «J 

Wute 

1973 

11 

I 

3 

64 

2 

g 

11 

2 

3 

36 

2 

g 

1  T 
1  «J 

11 

3 

3 

33 

2 

1  T 

Landis 

1981 

139 

t 

3 

94 

2 

g 

1  T 

Strauch, et al.

1970 

138 

I 

3 

10 

4 

g 

138 

2 

3 

10 

4 

8 

} 

^ite 

1973 

1 1 

4 

3 

66 

4 

8 

13 

Holzberg,  &  Gewirtz 

1963 

38 

1 

6 

59 

6 

g 

Kulik, et al.

1969 

74 

2 

6 

318 

g 

74 

3 

6 

318 

6 

g 

: 

Scheibe 

1965 

123 

I 

6 

99 

15 

g 

123 

2 

6 

99 

15 

g 

Spiegel,  et  al. 

1968 

70 

1 

6 

40 

6 

Q 

70 

2 

6 

40 

6 

Q 

Kulik,  et  al. 

1969 

74 

1 

6 

318 

6 

Q 

Smith 

1969 

133 

1 

6 

136 

6 

Q 

133 

2 

6 

136 

6 

Q 

Evans 

1976 

36 

\ 

2 

40 

a 
a 

1 

Kaoffinan 

1976 

137 

3 

4 

24 

2 

10 

Q 

137 

6 

4 

24 

2 

10 

Q 

137 

9 

4 

24 

1 0 

U 

157 

12 

4 

24 

1 0 

Q 

Colasuonno 

1981 

183 

1 

9 

32 

0 

1 0 

Q 

Weaver

1932 

U-l 

I 

19 

48 

1 

1 0 

Q 

Klein 

1969 

79 

1 

3 

48 

14 

J  J 

0 

Smith 

1969 

K,3 

3 

6 

134 

1  1 

U 

;13 

4 

6 

134 

6 

1  1 

Q 

Granofsky 

1956 

*04 

I 

19 

135 

^ 

1  I 

U 

104 

2 

19 

13*5 

6 

\  \ 

Q 

Hicks,  &  Spaner 

1952 

37 

1 

6 

78 

6 

1 2 

Q 

37 

2 

6 

334 

6 

12 

0 

37 

3 

6 

334 

6 

12 

0 

Beardsley

W9 

40 

1 

110 

2 

0 

40 

2 

130 

2 

0 

Lipsky 

1978 

10 

3 

60 

3 

0 

Dahl, et al.

1978 

26 

6 

8'" 

3 

0 

Paxton 

1983 

90 

2 

97 

3 

0 

90 

3 

97 

3 

0 

Marcxis 

1979 

264 

1 

396 

3 

0 

Marquart 

1984

35 

3 

47 

2 

3 

0 

33 

7 

47 

2 

3 

0 

33 

1  1 

47 

2 

3 

0 

COMINSTR

ASSESTYP

TRTECHE

TOTHRSE

ES 

: 

«■> 

2 

26 

3 

2 

24  0 

1  ai 

3 

2 

24  0 

2  29 

3 

2 

24  0 

1  88 

Q 

■ 

3 

^ 

1  5 

Q 

3 

2 

2  3 

30 

u 

3 

2 

24  0 

1  00 

u 

3 

2 

13 

u 

0 

3 

2 

00 

u 

0 

3 

2 

-  09 

u 

u 

3 

2 

00 

0 

0 

9 

2 

120  0 

55 

rt 
u 

0 

9 

180  0 

26 

0 

0 

9 

2 

180  0 

31 

0 

0 

3 

2 

71 

3 

2 

60  0 

I  24 

f\ 

u 

0 

IS 

2 

360  0 

23 

u 

0 

1 S 

2 

360  0 

13 

u 

0 

1  5 

2 

320  0 

12 

u 

0 

1  5 

2 

320  0 

35 

'J 

2 

3 

2 

57 

u 

2 

3 

2 

64 

Q 

3 

^ 

360  0 

30 

Q 

3 

12 

Q 

2 

3 

2 

1 1 

0 

1 

3 

2 

.  7 

.  63 

u 

3 

2 

7 

-  32 

u 

2 

7 

-  33 

0 

3 

2 

7 

-  44 

0 

3 

2 

7 

13 

u 

3 

2 

1  2 

17 

0 

3 

900  0 

74 

1  1 

0 

3 

2 

13 

^ 

3 

2 

999  9 

23 

2 

3 

2 

999  9 

28 

1 

0 

1 3 

2 

8  0 

-  16 

J 

u 

1 4 

2 

8  0 

40 

Q 

0 

3 

2 

420  0 

Z  04 

0 

0 

3 

2 

420  0 

91 

0 

0 

3 

2 

420  0 

90 

0 

0 

3 

2  9 

00 

0 

0 

3 

3 

2  9 

10 

0 

0 

3 

3 

5 

78 

0 

0 

14 

3 

7 

-  24 

0 

0 

13 

3 

8 

60 

0 

0 

15 

3 

8 

-  14 

0 

0 

4 

3 

23  0 

-  09 

0 

0 

3 

3 

4  2 

34 

0 

0 

13 

3 

4  2 

-  10 

0 

0 

6 

3 

4  2 

-  24 

<5 


AUTHOR

Lipsky
Dahl, et al.

Dahl, et al.
Westervelt

Dahl,  et  al. 


Cerreto 

^JartlI^e2 

Ibrahim 

Peterson 
Wilson,  &  Alcor 
Ibrahim 

Avery, & Davis


Brady 

Levinson 

Danpier 

Simon 


Dye 


Sawyer,  &  Clark 


Tt^  .iiburg 

Steen 

Rothschild 

Euse 

Forader 


YEAR

ID#

ESID#

ATTOWARD

TOTN

SETTINGE

EDLEVEXP

1978 

10 

1 

1 

60 

I 

3 

1978 

26 

5 

1 

a9 

1 

3 

*J8J 

206 

1 

2 

44 

0 

3 

1S78 

26 

3 

2 

a9 

1 

3 

1978 

144 

1 

2 

46 

2 

3 

144 

2 

2 

4^ 

2 

3 

19/8 

26 

4 

3 

a9 

1 

3 

26 

I 

a 

a9 

1 

3 

26 

2 

9 

39 

1 

3 

1976 

190 

1 

1 

34 

1 

7 

190 

2 

1 

34 

1 

7 

1977 

223 

4 

1 

30 

0 

g 

223 

5 

1 

30 

0 

B 

1979 

204 

3 

30 

13 

g 

204 

7 

30 

13 

3 

204 

9 

30 

13 

g 

1977 

223 

1 

30 

0 

g 

223 

2 

30 

0 

g 

1977 

136 

3 

43 

1 

Q 

1969 

146 

1 

30 

13 

g 

1979 

204 

1 

30 

13 

g 

204 

3 

I 

30 

13 

9 

198J 

222 

1 

2 

'  2 

1 

9 

222 

2 

2 

32 

1 

s 

222 

3 

2 

16 

1 

a 

1966 

22 

1 

a 

70 

7 

8 

1973 

6^ 

1 

a 

IB 

14 

1982 

246 

1 

9 

30 

1 

g 

246 

2 

9 

20 

1 

g 

1970 

46 

1 

15 

211 

1 

g 

46 

2 

13 

21  1 

1 

g 

46 

3 

13 

211 

1 

8 

46 

4 

13 

21  1 

1 

g 

46 

3 

13 

21  1 

1 

g 

46 

6 

13 

211 

1 

g 

46 

7 

13 

21  1 

1 

g 

1978 

34 

2 

6  . 

21 

12 

9 

34 

4 

6 

21 

12 

9 

34 

6 

6 

21 

12 

9 

34 

a 

6 

21 

12 

9 

1980 

153 

1 

20 

33 

13 

9 

153 

2 

20 

33 

13 

9 

13- 

3 

20 

33 

13 

9 

153 

4 

20 

38 

13 

9 

1983 

235 

1 

1 

40 

0 

10 

235 

2 

1 

40 

0 

10 

1980 

39 

1 

1 

30 

0 

10 

1978 

13° 

I 

1 

34 

1 

10 

1975 

124 

I 

20 

0 

3 

124 

2 

2 

20 

0 

a 

1969 

94 

1 

2 

63 

1 

6 

EDLEVEXP  UNIVMAJE  OCCUPEXP  COMINSTR  ASSESTYP  TRTECHE  TOTHRSE


0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
5 
5 
14 
14 
14 
5 
5 
^3 
5 
14 
14 
16 
16 
16 
i 

16 


6 
6 
6 
6 
16 
16 
16 
16 
0 
0 
0 
0 
16 
16 
0 


0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 

o 

0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
5 
5 
15 
5 
0 

o 

0 


1 
1 

0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 


3 
0 
0 
0 
0 
0 
0 
0 
0 
0 
2 
2 
2 
2 
0 
0 
0 
0 
0 
0 


3 
15 


6 
17 
17 
9 
9 
3 
3 
1 1 
3 
3 
3 
3 
3 
3 
3 
3 
3 

3 
9 
9 
3 
3 
3 
3 
3 
3 
3 
3 
3 
3 
3 
9 
9 
9 
9 

1 1 

1 1 
3 
3 
3 
3 
3 


3 
3 
3 
3 
3 
3 
3 
3 
3 
3 
3 
3 
3 
3 
3 
3 
3 
3 
3 
3 

3 

3 

3 

3 

3 

3 

3 

3 

3 

3 

3 

3 

3 

3 

3 

3 

3 

3 

3 

3 

3 

3 

3 

3 

3 

3 

3 

3 

4 

4 

3 


5 
7 

1  0 

7 
2 
2 
7 
7 
7 
7 
7 
2 

2 

5  0 
5  0 
3  0 
2 


24 
24 
24 
24 
23 
25 
25 
25 
1 
1 

4 
20 
6 
6 


ES 

1  26 
09 
05 
42 
47 
06 
30 
23 
23 
93 

-  35 
-1.  42 
-1.  28 

1.  39 
1.  97 
67 

-  42 

-  62 

-  13 
01 
91 

1  66 
34 
1  02 
1.  40 
31 
48 
3  11 
1  07 

-  01 
01 

-  24 

-  05 
03 

-  12 
32 
34 

-  02 
44 
67 
12 

1  31 
7X 
1.  03 
I  23 
1  23 
19 

-  70 
I  75 
1  73 

40 




AUTHOR

ID#

ESID#  ATTOWARD

TOTN

SETTINGE

Forader

1969 

94 

n 

2 

72 

94 

3 

n 

72 

94 

1. 

2 

63 

94 

5 

2 

64 

94 

2 

62 

Gottlieb 

1930 

47 

1 

3 

26 

47 

2 

3 

26 

47 

3 

3 

26 

Feldman, & Feldman

1985 

1  4 

1 

60 

1  T 

14 

2 

I 

60 

1  T 
1  u 

]  4 

3 

1 

60 

1 3 

Donaldson 

1974 

31 

1 

2 

49 

11 

2 

2 

48 

3 1 

3 

2 

55 

Morrison,  et  al. 

1978 

100 

I 

^ 

38 

100 

2 

TO 
JO 

Dye 

1978 

34 

6 

1 2 

34 

3 

20 

34 

5 

20 

1 2 

34 

7 

20 

1 2 

Grove 

1978 

1 73 

1 5 

173 

2 

88 

1 5 

Gerstem 

1976 

260 

1 

90 

■ 

Sanders 

1978 

207 

1 

40 

* 

207 

2 

207 

3 

207 

4 

40 

207 

5 

J 

207 

^ 

40 

J 

207 

7 

39 

207 

8 

33 

207 

9 

40 

2C7 

10 

38 

: 

Simpson,  et  al. 

1976 

2 

1 

38 

Rae 

1983 

108 

1 

TA 

108 

g 

g 

TA 

1  4 

Cronk 

1978 

1 86 

2 

5 

39 

1 86 

3 

g 

40 

1  5 

Sasso

1983 

193 

1 

7 

1 5 

Vinish 

1974 

23 

1 

60 

Wurzel

1980 

239 

I 

47 

239 

2 

47 

: 

239 

7 

24 

239 

8 

24 

J 

Marquart 

1984 

35 

2 

44 

2 

35 

6 

45 

2 

35 

10 

44 

2 

Mulj^ey 

1980 

88 

3 

34 

1 

Peurrish 

1974 

270 

1 

28 

1 

270 

2 

28 

1 

Lazar, et al.

1971 

76 

I 

44 

14 

UNIVMAJE

OCCUPEXP

ASSESTYP

TRTECHE

TOTHRSE

ES

0 

0 

I 

3 

5 

43 

u 

U 

3 

5 

3 

34 

u 

0 

1 

3 

5 

3 

25 

0 

0 

1 

3 

5 

3 

47 

0 

0 

1 

3 

5 

3 

19 

7 

U 

0 

0 

1  5 

5 

2 

2  42 

7 

0 

0 

0 

1  5 

5 

2 

1  78 

7 

U 

0 

0 

1  5 

5 

93 

g 

0 

1 

3 

5 

2  5 

65 

g 

5 

0 

1 

3 

5 

2  5 

71 

g 

0 

* 

3 

5 

2  5 

75 

g 

0 

1 

3 

- 

8 

1  32 

g 

0 

1 

3 

5 

8 

66 

g 

J 

0 

1 

3 

5 

8 

39 

g 

0 

0 

9 

5 

2  J 

37 

8 

0 

0 

9 

5 

2  0 

45 

9 

6 

0 

2 

3 

5 

24  0 

70 

6 

0 

- 

3 

5 

24  0 

30 

o 

T 

6 

0 

2 

3 

5 

24  0 

-  36 

9 

6 

0 

2 

3 

5 

24  0 

74 

1  1 

0 

1 

0 

3 

5 

4  0 

69 

1  1 

0 

1 

0 

3 

4  0 

94 

•J 

5 

0 

1 

3 

6 

1 

-  08 

a 
a 

5 

0 

3 

1  2 

03 

8 

5 

0 

i 

3 

6 

1  2 

62 

a 
a 

5 

0 

3 

6 

5 

26 
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1  5 

5 

Q 

60 

1 

20 

140 

1 

0 

60 

2 

20 

1  34 

1 

Q 

80 

1 

29 

1 1 

g 

J 

ao 

2 

28 

1 1 

g 

J 

30 

3 

30 

1 1 

g 

J 

80 

4 

30 

1 1 

g 

80 

5 

30 

1 1 

9 

1 

80 

6 

30 

1 1 

g 

J 

220 

4 

2 

33 

15 

g 

0 

d20 

7 

2 

33 

1  5 

g 

220 

10 

2 

33 

1  5 

g 

Q 

220 

13 

2 

33 

15 

8 

Q 

220 

1 

2 

33 

13 

g 

Q 

231 

1 

3 

295 

1 

g 

231 

2 

3 

29 1 

1 

g 

62 

1 

6 

27 

1 S 

g 

62 

4 

6 

27 

IS 

g 

62 

7 

6 

27 

13 

g 

62 

10 

6 

27 

13 

g 

103 

1 

9 

33 

12 

g 

2 

103 

2 

9 

33 

12 

g 

2 

103 

3 

9 

33 

12 

g 

2 

103 

4 

9 

33 

12 

8 

2 

103 

3 

9 

33 

12 

8 

2 

103 

6 

9 

33 

12 

8 

2 

103 

7 

9 

16 

12 

8 

2 

103 

8 

9 

13 

12 

8 

2 

103 

9 

9 

13 

12 

8 

2 

77 

20 

1 

9 

13 

4 

187 

13 

10 

0 

178 

117 

0 

10 

0 

33 

41 

1 

10 

0 

19 

27 

12 

10 

0 

142 

4 

37 

10 

0 

0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 

15 

15 

15 
5 
3 


TOTHRSE

ES 

9 

3  0 

37 

3 

9 

6  0 

63 

9 

3  0 

49 

3 

o 

T 

16 

3 

9 

37 

Q 

9 

A.  0 

-  10 

0 

9 

3  0 

62 

Q 

9 

6  0 

-  02 

0 

3 

o 

3  0 

32 

Q 

9 

2  7 

72 

0 

3 

9 

2  7 

10 

0 

9 

2  7 

49 

Q 

9 

2  7 

-  06 

J 

3 

9 

t  7 

62 

J 

3 

9 

5.  I 

58 

1 

3 

9 

4  1 

59 

3 

9 

2  7 

97 

u 

6 

9 

8.  3 

76 

0 

6 

9 

8  3 

74 

J 

3 

9 

64 

1 

3 

9 

1.  09 

1 

3 

9 

.  64 

1 

3 

9 

33 

1 

3 

9 

41 

1 

3 

9 

87 

u 

6 

9 

3.  3 

.  45 

u 

9 

9 

5  3 

.  47 

u 

9 

9 

5  3 

71 

u 

9 

9 

3.  3 

44 

1 

3 

9 

5  3 

L  68 

O 

3 

9 

6 

2  26 

o 

3 

9 

6 

2.  00 

o 

9 

9 

3i 

A 
\J 

o 

9 

-  13 

« 

3 

9 

81 

C 

9 

17 

U 

9 

1.  0 

-  44 

A 

3 

9 

1  0 

-  44 

A 

3 

9 

1  0 

-  33 

0 

V 

1  0 

-  47 

0 

3 

9 

1  0 

-  44 

0 

3 

9 

1  0 

-  37 

0 

3 

9 

:  0 

-  63 

0 

3 

9 

1  0 

-  14 

0 

3 

9 

t  0 

-1  22 

1 

3 

9 

44  0 

1  12 

0 

3 

9 

-  10 

1 

3 

9 

7  0 

23 

1 

3 

9 

5  7 

86 

3 

6 

9 

360  0 

76 

0 

3 

9 

24  0 

67 



AUTHOR

YEAR

ID#

ATTOWARD  TOTN

SETTINGE

EDLEVEXP

UNIVMAJE

OCCUPEXP

COMINSTR

Soyster 

1981 

6 

5 

80 

0 

1 1 

0 

15 

0 

6 

7 

80 

0 

0 

15 

0 

6 

9 

83 

*  0 

0 

13 

0 

6 

1 1 

81 

0 

0 

15 

0 

6 

2 

au 

0 

0 

15 

1 

loo*) 

106 

1 

Ha 

13 

1 1 

0 

15 

1 

106 

2 

25 

15 

1 1 

0 

15 

1 

Bremtlinger 

1978 

208 

1 

3 

39 

14 

1 1 

0 

15 

0 

208 

3 

3 

39 

14 

1 1 

0 

IS 

0 

208 

2 

3 

30 

13 

1 1 

0 

4 

0 

208 

4 

3 

30 

13 

1 1 

0 

4 

0 

Friedel 

1980

188 

1 

20 

1 34 

13 

12 

3 

0 

3 

1978 

224 

3 

20 

40 

1 

7 

13 

0 

0 

224 

3 

ro 

40 

1 

7 

13 

0 

0 

224 

4 

&o 

IS 

7 

13 

0 

0 

224 

6 

40 

15 

7 

13 

0 

0 

224 

1 

20 

1 

7 

13 

0 

1 

224 

2 

20 

AO 

13 

7 

13 

0 

1 

Haddle 

1973 

69 

1 

a  / 

11 

8 

1 

0 

1 

Cohen 

1973

220 

2 

37 

11 

8 

0 

0 

0 

220 

8 

2 

37 

11 

8 

0 

0 

0 

220 

11 

2 

37 

11 

8 

0 

0 

0 

220 

14 

2 

37 

1 1 

8 

0 

0 

0 

220 

2 

2 

37 

1 1 

8 

0 

0 

1 

Frazier 

1975 

62 

2 

6 

26 

1 1 

8 

0 

0 

62 

6 

26 

1 1 

8 

0 

0 

62 

a 

6 

26 

It 

8 

0 

2 

62 

1 1 

6 

26 

1 1 

8 

0 

2 

Haddle 

1973 

69 

2 

8 

87 

11 

8 

0 

0 

69 

3 

8 

87 

1 1 

8 

0 

0 

69 

4 

8 

87 

1 1 

8 

0 

0 

69 

3 

8 

87 

11 

8 

0 

0 

69 

6 

8 

87 

11 

8 

0 

0 

NUMBER OF CASES READ = 643

NUMBER OF CASES LISTED = 643

17 
17 
17 
17 
3 
3 
3 
3 
3 
3 
3 
6 
3 
9 
3 
9 
3 
3 
3 
6 
9 
9 
9 
3 
9 
9 
3 
3 
3 
3 
3 
3 
9 


9 
9 
9 
9 
9 
9 
9 
9 
9 
9 
9 
9 
'10 
10 
10 
10 
10 
10 
10 
10 
10 
10 
10 
10 
10 
10 
iO 
10 
10 
10 
10 
10 
10 


8.  0 


3 
0 
0 
0 
0 
0 

c 

0 
3.  0 
3.  0 
3.  3 
3.  3 
3.  3 
3.  3 
3.  3 


ES

21 
14 

07 

-  12 
28 
37 
30 
51 
20 
89 
38 

-  39 
56 

1.  07 
63 
38 
62 
30 
13 
1 1 
12 
17 

-  16 
1  26 

-  06 

-  16 
51 

-  69 
27 
32 
32 
30 
02 




TREATMENT A VERSUS TREATMENT B EFFECT SIZES

KEY 


The key for these effect sizes is the same as for the T vs. C,
T vs. P, and Pre-post effect sizes, except:


TRTECHE is Treatment A, and
TRTECHC is Treatment B.




TREATMENT  A  VERSUS  TREATMENT  B  EFFECT  SIZES 


AUTHOR

YEAR

ESID#

ATTOWARD

TOTN

SETTINGE

EDLEVEXP

UNIVMAJE

OCCUPEXP

COMINSTR

ASSESTYP

TRTECHE

TRTECHC

TOTHRSE

ES

Dresang 

1981 

52 

1 

76 

3 

0 

0 

0 

4 

1 

2 

7 

05 

52 

4 

1  14 

3 

0 

0 

0 

9 

1 

c 

7 

-  1  2 

MuUcey 

1980 

da 

1 

32 

3 

0 

0 

1 

3 

1 

a 

3 

-  2^ 

89 

2 

SI 

S 

0 

0 

1 

3 

1 

8 

3 

-  4o 

GQ 

3 

47 

6 

0 

0 

1 

3 

1 

a 

3 

55 

Oberle

1975 

200 

3 

I 

80 

7 

16 

0 

1 

3 

1 

2 

1 

3  ' 

January

1978 

165 

3 

63 

8 

0 

0 

9 

2 

2 

-  10 

165 

4 

613 

a 

0 

0 

9 

1 

2 

2 

lo 

Dealey 

1978 

194 

2 

52 

8 

1 

0 

0 

3 

1 

2 

47  S 

-  06 

January 

1978 

1^5 

1 

I 

63 

8 

0 

1 

3 

1 

2 

2 

05 

165 

2 

63 

8 

1 

0 

1 

3 

1 

2 

2 

10 

Deu-ley 

1978 

184 

1 

1 

52 

8 

1 

0 

1 

3 

1 

2 

47  S 

1.  1  2 

Noe 

1902 

140 

1 

1 

74 

8 

4 

0 

4 

3 

1 

2 

28  3 

-  20 

140 

2 

59 

8 

4 

0 

4 

3 

1 

8 

28  3 

-  55 

140 

3 

37 

8 

4 

0 

4 

3 

1 

a 

28  3 

-3'; 

1  40 

4 

74 

8 

4 

0 

4 

3 

2 

28  3 

-  1^ 

1 

1  4U 

59 

8 

4 

0 

4 

3 

3 

28  3 

"  9o 

1 40 

6 

57 

3 

4 

0 

4 

3 

S 

28  3 

-  64 

1 40 

7 

74 

8 

4 

0 

4 

3 

2 

28  3 

-  25 

1  40 

a 
a 

* 

59 

8 

4 

0 

4 

3 

8 

28  3 

-  53 

1 40 

o 

1 

!57 

J 

8 

4 

0 

4 

3 

8 

28  3 

-  33 

\XL(3HOlH 

jjOo 

AO 

1 

6 

63 

8 

1 

0 

2 

3 

2 

40  0 

42 

4a 

2 

6 

63 

J 

B 

1 

0 

2 

3 

2 

40  0 

41 

1978 

165 

1 4 

63 

8 

1 

0 

0 

9 

2 

2 

09 

6 

1 4 

63 

J 

a 

1 

0 

0 

9 

2 

27 

1985 

c^oa 

1  o 

9 

1  6 

0 

0 

3 

2 

1.  0 

23 

1983 

229 

1 

J" 

3* 

/\ 
u 

1 2 

1  6 

1  5 

1 

3 

2 

.  7 

QO 

2 

2^3 

Q 

1  D 

1  6 

1  *^ 

1 

3 

2 

7 

20 

Qnerton  &  Hothnvm 

1975 

151 

2 

52 

-J 

p 

o 

r\ 
\J 

0 

1 

2 

2 

- 

.  02 

MacIntyre

1981 

163 

1 

31 

14 

3 

0 

0  " 

0 

3 

2 

3 

S 

415 

163 

2 

1 

47 

14 

3 

0 

0 

0 

3 

3 

3 

69 

163 

3 

48 

14 

3 

0 

0 

0 

3 

2 

3 

3 

23 

Rynders 

1980

212 

3 

12 

12 

5 

0 

0 

0 

16 

2 

3 

8  0 

84 

Oberle

1975 

200 

1 

1 

80 

1  1 

7 

16 

0 

1 

3 

2 

2 

-  ^3 

200 

2 

1 

80 

1  1 

7 

16 

0 

1 

3 

2 

2 

-  04 

Cautela 

1971 

221 

1 

42 

13 

8 

1 

0 

0 

3 

4 

1 1 

1  2cj 

Oberle

1975 

200 

4 

1 

120 

IS 

7 

16 

0 

1 

3 

7 

a 

-  30 

Fisher 

1975 

42 

1 

50 

2 

a 

13 

0 

0 

3 

7 

8 

1 

42 

2 

30 

2 

8 

13 

0 

0 

3 

7 

a 

-  1  i 

42 

3 

30 

2 

8 

13 

0 

0 

3 

7 

8 

-  10 

Dyer 

1970 

56 

1 

67 

13 

8 

16 

0 

0 

3 

7 

2 

la 

56 

2 

67 

13 

8 

16 

0 

0 

6 

7 

2 

44 

36 

3 

67 

13 

8 

16 

0 

0 

6 

7 

49 

Royster 

1981 

6 

3 

55 

0 

1  1 

0 

15 

0 

17 

8 

10 

7  S 

29 

6 

12 

85 

0 

1  1 

0 

1 0 

17 

& 

10 

7  5 

3 : 

6 

13 

80 

0 

1  1 

0 

lb 

0 

17 

8 

10 

7  S 

3a 

Rynders 

1980 

212 

1 

12 

12 

3 

0 

A 

0 

1  6 

9 

3 

8.  0 

2  7i 

212 

2 

12 

12 

5 

0 

0 

0 

16 

9 

3 

8  0 

3  55 

Hazzard 

1981 

4S 

6 

130 

1 

7 

0 

0 

0 

3 

9 

9 

Z  0 

-  03 

45 

7 

150 

1 

7 

0 

0 

0 

3 

9 

9 

5  0 

00 

45 

8 

150 

1 

7 

0 

0 

0 

6 

9 

9 

5  0 

-  2l 

43 

9 

130 

1 

7 

0 

0 

0 

6 

9 

9 

5  0 

00 

J 




AUTHOR 


YEAR 


ID#  ESID#  ATTOWARD  TOTN  SETTINGE  EDLEVEXP  UNIVMAJE  OCCUPEXP  COMINSTR  ASSESTYP  TRTECHE  TRTECHC  TOTHRSE


Newman 

1978 

199 

1 

199 

2 

199 

4 

199 

3 

Rodriguez 

1978 

97 

2 

97 

3 

97 

1 

Meyer 

1963 

181 

1 

181 

2 

NUMBER  OF 

CASES  READ 

3 

61 

I 


I 
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ERIC  ' 


7 

205 

15 

7 

0 

0 

7 

205 

15 

7 

0 

0 

7 

205 

15 

7 

0 

0 

7 

205 

15 

7 

0 

0 

71 

1 

8 

16 

0 

80 

1 

8 

16 

0 

71 

1 

8 

16 

0 

34 

15 

8 

6 

0 

24 

15 

8 

6 

0 

NUMBER OF CASES LISTED = 61


•  t  •  t 


0 

a 

9 

9 

4 

5 

-1  07 

0 

a 

9 

9 

4 

5 

iZ 

0 

9 

9 

9 

4 

5 

20 

1 

3 

9 

9 

4 

5 

-  OP. 

0 

6 

9 

7 

2 

0 

-  05 

0 

17 

9 

7 

2 

0 

-  43 

1 

3 

9 

7 

2 

0 

-  13 

1 

3 

9 

2 

10 

0 

4-^ 

I 

3 

9 

10 

0 

27 



MAINSTREAMING  EFFECT  SIZES 
KEY 


The key for these effect sizes is the same as for the T vs. C, T vs. P, and
Pre-post  effect  sizes  except: 


Column      Codes


MSINSTRC    The type of instruction in the mainstreamed classrooms:

            0=can't tell                          4=peer tutoring
            1=standard group/class instruction    5=combination
            2=cooperative learning                6=other
            3=individualized instruction

MSDISCHD    The disabilities of the mainstreamed children:

            0=can't tell                          7=communication disordered
            1=mildly and moderately retarded      8=physically and health impaired
            2=emotionally disturbed               9=learning disabled
            3=visually impaired                   10=combination
            4=blind                               11=others
            5=hearing impaired
            6=deaf

MSPEERS     Any special instruction for nondisabled peers in the
            mainstreamed classroom:

            0=can't tell      3=vicarious experience
            1=none            4=reinforcement
            2=information     5=persuasive messages

MSMONTHS    The number of months the experimental (nondisabled) Ss
            participated in mainstreaming. If no number is entered, the
            information was not available.
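
The coding scheme in this key can be written out as lookup tables. The sketch below is only an illustration in Python: the dictionary contents are transcribed from the key, while the function name `decode_mainstreaming`, its argument layout, and the sample values passed to it are assumptions made for the example and do not come from the report's data file.

```python
from typing import Optional

# Code-to-label tables transcribed from the mainstreaming key.
MSINSTRC = {
    0: "can't tell",
    1: "standard group/class instruction",
    2: "cooperative learning",
    3: "individualized instruction",
    4: "peer tutoring",
    5: "combination",
    6: "other",
}

MSDISCHD = {
    0: "can't tell",
    1: "mildly and moderately retarded",
    2: "emotionally disturbed",
    3: "visually impaired",
    4: "blind",
    5: "hearing impaired",
    6: "deaf",
    7: "communication disordered",
    8: "physically and health impaired",
    9: "learning disabled",
    10: "combination",
    11: "others",
}

MSPEERS = {
    0: "can't tell",
    1: "none",
    2: "information",
    3: "vicarious experience",
    4: "reinforcement",
    5: "persuasive messages",
}


def decode_mainstreaming(msinstrc: int, msdischd: int, mspeers: int,
                         msmonths: Optional[float] = None) -> dict:
    """Translate one row's numeric codes into readable labels.

    Hypothetical helper: a missing MSMONTHS entry (None) is reported as
    "not available", following the key's note on blank entries.
    """
    return {
        "MSINSTRC": MSINSTRC[msinstrc],
        "MSDISCHD": MSDISCHD[msdischd],
        "MSPEERS": MSPEERS[mspeers],
        "MSMONTHS": "not available" if msmonths is None else msmonths,
    }


# Made-up code values, used only to show the call.
print(decode_mainstreaming(2, 10, 1, 17))
print(decode_mainstreaming(0, 8, 0))
```

Keeping each column in its own dictionary mirrors the key's layout, so a code that is valid in one column (for example 10 in MSDISCHD) raises a `KeyError` if it is looked up in another.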


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MAINSTREAMING  EFFECT  SIZES 


AUTHOR 

YEAR 

ID# 

ESID# 

TOTN 

TRTCONXT  EDLEVEXP 

MSINSTRC 

MSDISCHD 

MSPEERS 

COMINSTR

ASSESTYP 

MSMONTHS 

ES 

Sigler et al.

1978 

169 

1 

180 

1  3 

0 

0 

0 

4 

3 

10 

Brighi 

1978 

24 

2 

96 

1  3 

0 

8 

0 

1 

3 

.  39 

24 

3 

83 

1  3 

0 

8 

0 

1 

3 

.  49 

24 

4 

69 

1  3 

0 

8 

0 

1 

3 

,  10 

Voeltz

1982 

271 

1 

520 

1  3 

0 

10 

2 

0 

3 

17 

02 

271 

2 

537 

1  3 

0 

10 

1 

0 

3 

13 

-  16 

271 

7 

520 

1  3 

0 

10 

2 

0 

3 

17 

19 

271 

8 

537 

1  3 

0 

10 

1 

0 

3 

13 

-  07 

t^pier  et  al. 

1972 

166 

1 

148 

1  3 

1 

a 

0 

9 

9 

31 

Smith  &  Larson 

1980 

169 

1 

154 

1  S 

0 

10 

1 

1 

3 

18 

72 

Strauch

1970 

i70 

1 

124 

1  5 

1 

1 

1 

0 

9 

OS 

170 

2 

124 

1  S 

1 

1 

1 

0 

9 

24 

Brighi

1978 

24 

1 

542 

1  7 

0 

a 

0 

3 

43 

Voeltz 

1980

187 

1 

856 

1  7 

0 

10 

2 

3 

8 

1  OS 

187 

2 

877 

1  7 

0 

10 

1 

0 

3 

6 

.  41 

187 

3 

856 

1  7 

0 

10 

2 

0 

3 

a 

.  36 

187 

4 

877 

I  7 

0 

10 

1 

0 

3 

6 

-  01 

Hoseley 

1973 

135 

1 

80 

1  7 

1 

0 

1 

1 

3 

9 

.  16 

135 

2 

80 

1  7 

1 

0 

1 

1 

3 

9 

81 

Winkler 

1981 

54 

1 

114 

1  10 

0 

1 1 

1 

0 

1  1 

.  07 

NUMBER OF CASES READ = 20

NUMBER OF CASES LISTED = 20



MISSING INFORMATION RESULTS
(No  Treatment  A  vs.  B  results) 

KEY 


The key for these effect sizes is the same as for the T vs. C,
T vs. P, and Pre-post effect sizes, except:

Instead of an ES column, there is an ESAVAIL (Effect Size
Available) column, for which the codes are:

1=No, can't tell direction of result
2=No, negative result
3=No, positive result
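
Because these entries carry only the direction of a result, one simple way to summarize them is to count how many fall under each ESAVAIL code. The sketch below illustrates that count in Python. The label text follows the key, but the function name `tally_directions` and the sample list of codes are assumptions made for illustration and are not taken from the listing.

```python
from collections import Counter

# ESAVAIL codes as given in the key; in every case no effect size was available.
ESAVAIL = {
    1: "can't tell direction of result",
    2: "negative result",
    3: "positive result",
}


def tally_directions(codes):
    """Count how many results fall under each ESAVAIL label."""
    counts = Counter(ESAVAIL[code] for code in codes)
    # Report every label, including those with a zero count.
    return {label: counts.get(label, 0) for label in ESAVAIL.values()}


# Made-up codes, used only to show the call.
sample_codes = [3, 3, 1, 2, 3]
print(tally_directions(sample_codes))
```

A tally of this kind records direction only; it says nothing about the size of the effects, which is why these results are listed separately from the effect-size tables.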




MISSING  INFORMATION  RESULTS 
(No  Treatment  A  vs.  B  results) 


AUTHOR 
Baran 

Miller  et  al. 
DuHoux 
Sasso  et  al. 
Miller  et  al. 
Simpson  et  al. 

Meehan 


Schroeder 


Rusalem 

Austin et al.
Spreen 


Prothero  &  Ehlers 
Dixon 

Naor  &  Milgram 
Parish  et  al. 


Haurasyndw  &  Home 
Ingram 

Aldridge 
Ashnoje 

Frith  &  Lindsey 
Wallace
D'Zaniko 
Kroner 

Messinger-nevell 


Threlkeld & DeJong


YEAR 

ESID#

SETTINGE

EDLEVEXP

UNIVMAJE

OCCUPEXP

COMINSTR

ASSESTYP

TRTECHE  TOTHRSE

ESAVAIL

1077 

101 
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